PROTEASE VARIANTS

Field of the Invention 

The present invention relates to variants of a parent protease, in particular variants of
amended properties, such as improved thermostability and/or amended temperature activity
profile. The invention also relates to DNA sequences encoding such variants, their production
in a recombinant host cell, as well as methods of using the variants, in particular within the
field of animal feed and detergents. The invention furthermore relates to methods of
generating and preparing protease variants of amended properties. Preferred parent
proteases are the Nocardiopsis proteases comprising SEQ ID NOs: 2, 4, 6, 8 and 10.

Background of the Invention

Proteases derived from strains of Nocardiopsis are disclosed in WO 88/03947, WO
01/58276, and DK 1996 00013 ("Protease 10," SEQ ID NOs: 1-2); DK 2003 00912 ("Protease
08," SEQ ID NOs: 9-10); DK 2003 00913 ("Protease 11," SEQ ID NOs: 5-6); DK 2003 00914
("Protease 18," SEQ ID NOs: 3-4); DK 2003 00915 ("Protease 35," SEQ ID NOs: 7-8), DD
2004328, and in JP 02255081.

It is an object of the present invention to provide novel and improved protease variants,
in particular of amended properties, such as improved thermostability and/or a higher or lower
optimum temperature.

Summary of the Invention

The present invention relates to a variant of a parent protease, comprising a
substitution in at least one position of at least one region selected from the group of regions
consisting of: 6-18; 22-28; 32-39; 42-58; 62-63; 66-76; 78-100; 103-106; 111-114; 118-131;
134-136; 139-141; 144-151; 155-156; 160-176; 179-181; and 184-188; wherein

(a) the variant has protease activity; and 

(b) each position corresponds to a position of SEQ ID NO: 2; and 

(c) the variant has a percentage of identity to SEQ ID NO: 2 of at least 60%.

The present invention also relates to isolated nucleic acid sequences encoding the
protease variant and to nucleic acid constructs, vectors, and host cells comprising the nucleic
acid sequences as well as methods for producing and using the protease variants.

Brief Description of the Figures

Figure 1 is a multiple alignment of Protease 10, Protease 18, Protease 11, Protease 35
and Protease 08 (the mature peptide parts of SEQ ID NOs: 2, 4, 6, 8 and 10, respectively) and
of a protease variant of the invention, viz. Protease 22 (amino acids 1-188 of SEQ ID NO: 12);
and

Figure 2 provides the coordinates of the novel 3D structure of Protease 10 (SEQ ID
NO: 2) derived from Nocardiopsis sp. NRRL 18262.



Detailed Description of the Invention 
Three-dimensional Structure of Protease 10 

The structure of Protease 10 was solved in accordance with the principles for X-ray
crystallographic methods as given, for example, in X-Ray Structure Determination, Stout, G.H.
and Jensen, L.H., John Wiley & Sons, Inc. NY, 1989. The structural coordinates for the crystal
structure at 2.2 Å resolution using the isomorphous replacement method are given in Fig. 2 in
standard PDB format (Protein Data Bank, Brookhaven National Laboratory, Brookhaven, CT).
The PDB file of Fig. 2 relates to the mature peptide part of Protease 10 corresponding to
residues 1-188 of SEQ ID NO: 2.



Molecular Dynamics (MD)

Molecular Dynamics (MD) simulations are indicative of the mobility of the amino acids
in a protein structure (see McCammon, JA and Harvey, SC, (1987), "Dynamics of proteins
and nucleic acids", Cambridge University Press). Such protein dynamics are often compared
to the crystallographic B-factors (see Stout, GH and Jensen, LH, (1989), "X-ray structure
determination", Wiley). By running the MD simulation at, e.g., different temperatures, the
temperature related mobility of residues is simulated. Regions having the highest mobility or
flexibility (here isotropic fluctuations) may be suggested for random mutagenesis. It is here
understood that the high mobility found in certain areas of the protein may be thermally
improved by substituting these residues.

Using the programs CHARMM (Accelrys) and NAMD (University of Illinois at Urbana-
Champaign) the Protease 10 structure described above was subjected to MD at 300 and
400K. Starting from the coordinates of Figure 2, hydrogen and missing heavy atoms were built
using CHARMM procedures HBUILD and IC BUILD respectively. Then the structure was
minimized using CHARMM Conjugate Gradients (CONJ) minimization procedure for a total of
200 steps. The protein was then put in a 70 x 70 x 70 Angstrom box and solvated with TIP3
water molecules. A total of 11124 water molecules were added and then minimized, keeping
the protein coordinates fixed, using CHARMM Adopted Basis Newton Raphson (ABNR)
minimization procedure for 20000 steps. The system was then heated to the desired
temperature at a rate of 1K every 100 steps using the NAMD software. After an equilibration of
50 picoseconds, an NVE ensemble MD was run for 1 nanosecond, both steps done with the
software NAMD. A cut-off of 12 Angstrom was used for the non-bonded interactions. Periodic
boundary conditions were used after the solvation step and for all the subsequent ones. The
isotropic root mean square (RMS) fluctuations were calculated with the CHARMM procedure
COOR DYNA.

The following suggested regions for mutagenesis result from MD simulations: From
residue 160 to 170, from residue 78 to 90, from residue 43 to 50, from residue 66 to 75, and
from residue 22 to 28.
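
As an illustration of how per-residue isotropic RMS fluctuations can point to mobile regions, the following is a minimal Python sketch. It is not the CHARMM COOR DYNA procedure itself. It assumes that the trajectory frames have already been superimposed and reduced to one representative coordinate per residue, and the function names and the threshold are hypothetical choices made for this example only.

```python
import math


def rms_fluctuations(trajectory):
    """Return one isotropic RMS fluctuation per residue.

    trajectory: list of frames; each frame is a list of (x, y, z) tuples,
    one per residue (assumption: frames are already superimposed).
    """
    n_frames = len(trajectory)
    n_residues = len(trajectory[0])
    result = []
    for i in range(n_residues):
        # Mean position of residue i over all frames.
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames for k in range(3)]
        # Mean squared deviation from that mean position.
        msd = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        result.append(math.sqrt(msd))
    return result


def mobile_regions(rmsf, threshold, first_residue=1):
    """Group consecutive residues whose fluctuation exceeds a (hypothetical)
    threshold into (start, end) residue ranges."""
    regions = []
    start = None
    for offset, value in enumerate(rmsf):
        position = first_residue + offset
        if value > threshold:
            if start is None:
                start = position
        elif start is not None:
            regions.append((start, position - 1))
            start = None
    if start is not None:
        regions.append((start, first_residue + len(rmsf) - 1))
    return regions
```

Applied to fluctuations computed at 300 and 400K, a grouping of this kind would yield residue ranges comparable in form to those listed above; the actual cut-off used to select the regions is not stated in the text.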


Strategy for Preparing Variants 

Regions of amino acid residues, as well as individual amino acid substitutions, were
suggested for mutagenesis based on the 3D-structure of Fig. 2 and the alignment of Fig. 1,
mainly with a view to improving thermostability.

The following regions were suggested, cf. claim 1: 6-18; 22-28; 32-39; 42-58; 62-63;
66-76; 78-100; 103-106; 111-114; 118-131; 134-136; 139-141; 144-151; 155-156; 160-176;
179-181; and 184-188.
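
The region list can be encoded directly, which makes it easy to check whether a given position (numbered according to SEQ ID NO: 2) falls inside one of the seventeen regions. The sketch below is illustrative only; the names `REGIONS` and `region_of` are hypothetical and not part of the specification.

```python
# The seventeen regions suggested for mutagenesis, as inclusive
# (start, end) positions numbered according to SEQ ID NO: 2.
REGIONS = [
    (6, 18), (22, 28), (32, 39), (42, 58), (62, 63), (66, 76),
    (78, 100), (103, 106), (111, 114), (118, 131), (134, 136),
    (139, 141), (144, 151), (155, 156), (160, 176), (179, 181),
    (184, 188),
]


def region_of(position):
    """Return the (start, end) region containing the position, or None."""
    for start, end in REGIONS:
        if start <= position <= end:
            return (start, end)
    return None
```

For example, `region_of(26)` returns `(22, 28)`, whereas `region_of(19)` returns `None` because position 19 lies between two regions.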
At least one of the following positions of the above regions is preferably subjected to
mutagenesis, cf. claim 3: 6; 7; 8; 9; 10; 12; 13; 16; 17; 18; 22; 23; 24; 25; 26; 27; 28; 32; 33;
37; 38; 39; 42; 43; 44; 45; 46; 47; 48; 49; 50; 51; 52; 53; 54; 55; 56; 58; 62; 63; 66; 67; 68; 69;
70; 71; 72; 73; 74; 75; 76; 78; 79; 80; 81; 82; 83; 84; 85; 86; 87; 88; 89; 90; 91; 92; 93; 94; 95;
96; 97; 98; 99; 100; 103; 105; 106; 111; 113; 114; 118; 120; 122; 124; 125; 127; 129; 130;
131; 134; 135; 136; 139; 140; 141; 144; 145; 146; 147; 148; 149; 150; 151; 155; 156; 160;
161; 162; 163; 164; 165; 166; 167; 168; 169; 170; 171; 172; 173; 174; 175; 176; 179; 180;
181; 184; 185; 186; 187; and/or 188.

Contemplated specific variants are listed in the claims, viz. variants of Protease 10,
Protease 18, Protease 11, Protease 35 as well as Protease 08 in claims 4 and 15; variants of
Protease 10 in claim 16; variants of Protease 18 in claim 17; variants of Protease 11 in claim
18; variants of Protease 35 in claim 19; and variants of Protease 08 in claim 20. Subgroups of
specific variants are listed in claims 21-23.

The various concepts underlying the invention are also reflected in the claims as
follows: Stabilization by disulfide-bridges in claims 5 and 6; proline-stabilization in claims 7-8;
substitution of exposed neutral residues with negatively charged residues in claims 9-10;
substitution of exposed neutral residues with positively charged residues in claims 11-12;
substitution of small residues with bulkier residues inside the protein in claim 13; and regions
proposed for mutagenesis following MD simulations in claim 14.

The term "at least one" means "one or more," viz., e.g. in the context of regions: One,
two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, thirteen, fourteen, fifteen,
sixteen, or seventeen; or, in the context of positions or substitutions: One, two, three, four,
five, and so on, up to e.g. ninety.

In a particular embodiment, the number of regions proposed for and/or subjected to
mutagenesis is at least one, two, three, four, five, six, seven, eight, nine, ten, eleven, twelve,
thirteen, fourteen, fifteen, sixteen, or at least seventeen.

In another particular embodiment, the number of regions proposed for and/or subjected
to mutagenesis is no more than one, two, three, four, five, six, seven, eight, nine, ten, eleven,
twelve, thirteen, fourteen, fifteen, sixteen, or no more than seventeen.

Polypeptides Having Protease Activity

Polypeptides having protease activity, or proteases, are sometimes also designated
peptidases, proteinases, peptide hydrolases, or proteolytic enzymes. Proteases may be of the
exo-type that hydrolyse peptides starting at either end thereof, or of the endo-type that act
internally in polypeptide chains (endopeptidases). Endopeptidases show activity on N- and
C-terminally blocked peptide substrates that are relevant for the specificity of the protease in
question.

The term "protease" is defined herein as an enzyme that hydrolyses peptide bonds.
This definition of protease also applies to the protease-part of the terms "parent protease" and
"protease variant," as used herein. The term "protease" includes any enzyme belonging to the
EC 3.4 enzyme group (including each of the thirteen subclasses thereof). The EC number
refers to Enzyme Nomenclature 1992 from NC-IUBMB, Academic Press, San Diego,
California, including supplements 1-5 published in Eur. J. Biochem. 1994, 223, 1-5; Eur. J.
Biochem. 1995, 232, 1-6; Eur. J. Biochem. 1996, 237, 1-5; Eur. J. Biochem. 1997, 250, 1-6;
and Eur. J. Biochem. 1999, 264, 610-650; respectively. The nomenclature is regularly
supplemented and updated; see e.g. the World Wide Web (WWW) at
http://www.chem.qmw.ac.uk/iubmb/enzyme/index.html.

Proteases are classified on the basis of their catalytic mechanism into the following
groups: Serine proteases (S), Cysteine proteases (C), Aspartic proteases (A), Metalloproteases
(M), and Unknown, or as yet unclassified, proteases (U), see Handbook of Proteolytic
Enzymes, A.J. Barrett, N.D. Rawlings, J.F. Woessner (eds), Academic Press (1998), in
particular the general introduction part.

In particular embodiments, the parent proteases and/or the protease variants of the
invention and for use according to the invention are selected from the group consisting of:

(a) Proteases belonging to the EC 3.4.-.- enzyme group;

(b) Serine proteases belonging to the S group of the above Handbook;

(c1) Serine proteases of peptidase family S2A; and

(c2) Serine proteases of peptidase family S1E as described in Biochem. J. 290:205-
218 (1993) and in MEROPS protease database, release 6.20, March 24, 2003
(www.merops.ac.uk). The database is described in Rawlings, N.D., O'Brien, E.A. & Barrett,
A.J. (2002) MEROPS: the protease database. Nucleic Acids Res. 30, 343-346.

For determining whether a given protease is a Serine protease, and a family S2A
protease, reference is made to the above Handbook and the principles indicated therein. Such
determination can be carried out for all types of proteases, be it naturally occurring or wild-type
proteases; or genetically engineered or synthetic proteases.

Protease activity can be measured using any assay, in which a substrate is employed,
that includes peptide bonds relevant for the specificity of the protease in question. Assay-pH
and assay-temperature are likewise to be adapted to the protease in question. Examples of
assay-pH-values are pH 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12. Examples of assay-temperatures
are 30, 35, 37, 40, 45, 50, 55, 60, 65, 70, 80, 90, or 95°C. Examples of protease substrates
are casein, such as Azurine-Crosslinked Casein (AZCL-casein). Examples of suitable
protease assays are described in Example 1.

Parent Protease 

The parent protease is a protease from which the protease variant is, or can be,
derived. For the present purposes, any protease can be used as the parent protease, as long
as the resulting protease variant is homologous to Protease 10, i.e. the protease derived from
Nocardiopsis sp. NRRL 18262 and comprising amino acids 1-188 of SEQ ID NO: 2.

In a particular embodiment the parent protease is also homologous to Protease 10.

In the present context, homologous means having an identity of at least 60% to SEQ
ID NO: 2, viz. amino acids 1-188 of the mature peptide part of Protease 10. Homology is
determined as generally described below in the section entitled Amino Acid Homology.

The parent protease may be a wild-type or naturally occurring polypeptide, or an allelic
variant thereof, or a fragment thereof that has protease activity, in particular a mature part
thereof. It may also be a variant thereof and/or a genetically engineered or synthetic
polypeptide.

In a particular embodiment the wild-type parent protease is i) a bacterial protease; ii) a
protease of the phylum Actinobacteria; iii) of the class Actinobacteria; iv) of the order
Actinomycetales; v) of the family Nocardiopsaceae; vi) of the genus Nocardiopsis; and/or a
protease derived from vii) Nocardiopsis species, such as Nocardiopsis alba, Nocardiopsis
antarctica, Nocardiopsis composta, Nocardiopsis dassonvillei, Nocardiopsis exhalans,
Nocardiopsis halophila, Nocardiopsis halotolerans, Nocardiopsis kunsanensis, Nocardiopsis
listeri, Nocardiopsis lucentensis, Nocardiopsis metallicus, Nocardiopsis prasina, Nocardiopsis
sp., Nocardiopsis synnemataformans, Nocardiopsis trehalosi, Nocardiopsis tropica,
Nocardiopsis umidischolae, or Nocardiopsis xinjiangensis.

Examples of such strains are: Nocardiopsis alba DSM 15647 (wild-type producer of
Protease 08), Nocardiopsis dassonvillei NRRL 18133 (wild-type producer of Protease M58-1
described in WO 88/03947), Nocardiopsis dassonvillei subsp. dassonvillei DSM 43235 (wild-
type producer of Protease 18), Nocardiopsis prasina DSM 15648 (wild-type producer of
Protease 11), Nocardiopsis prasina DSM 15649 (wild-type producer of Protease 35),
Nocardiopsis sp. NRRL 18262 (wild-type producer of Protease 10), Nocardiopsis sp. strain
FERM P-10508 (described in JP 02255081), or Nocardiopsis dassonvillei strain ZIMET 43647
(described in DD 2,004,328).

Strains of these species are accessible to the public in a number of culture collections,
such as the American Type Culture Collection (ATCC), Deutsche Sammlung von
Mikroorganismen und Zellkulturen GmbH (DSMZ), Centraalbureau Voor Schimmelcultures
(CBS), and Agricultural Research Service Patent Culture Collection, Northern Regional
Research Center (NRRL), e.g. Nocardiopsis dassonvillei subsp. dassonvillei DSM 43235 is
publicly available from DSMZ (Deutsche Sammlung von Mikroorganismen und Zellkulturen
GmbH, Braunschweig, Germany). The strain was also deposited at other depositary
institutions as follows: ATCC 23219, IMRU 1250, NCTC 10489.

The following biological materials were deposited in connection with the filing of other
patent applications under the terms of the Budapest Treaty with the Agricultural Research
Culture Collection (NRRL), Peoria, US, or the Deutsche Sammlung von Mikroorganismen und
Zellkulturen GmbH, Mascheroder Weg 1b, D-38124 Braunschweig, and given the following
accession numbers:

Deposit                      Accession Number    Date of Deposit

Nocardiopsis sp.             NRRL 18262          November 10, 1987

Nocardiopsis dassonvillei    NRRL 18133          November 13, 1986

Nocardiopsis alba            DSM 15647           May 30, 2003

Nocardiopsis prasina         DSM 15648           May 30, 2003

Nocardiopsis prasina         DSM 15649           May 30, 2003

These strains have been deposited under conditions that assure that access to the
cultures will be available during the pendency of the other patent applications to one
determined by the Commissioner of Patents and Trademarks to be entitled thereto under 37
C.F.R. §1.14 and 35 U.S.C. §122. The deposits represent substantially pure cultures of the
deposited strains. The deposits are available as required by foreign patent laws in countries
wherein counterparts of these applications, or their progeny are filed. However, it should be
understood that the availability of a deposit does not constitute a license to practice the
invention in derogation of patent rights granted by governmental action.

Of the above strains, the three last-mentioned were isolated in 2001 from soil samples
from Denmark.

Furthermore, such polypeptide(s) may be identified and obtained from other sources
including microorganisms or DNA isolated from nature (e.g., soil, composts, water, etc.) using
suitable probes. Techniques for isolating microorganisms or DNA from natural habitats are
well known in the art. The nucleic acid sequence may then be derived by similarly screening a
genomic or cDNA library of another microorganism. Once a nucleic acid sequence encoding a
polypeptide has been detected with the probe(s), the sequence may be isolated or cloned by
utilizing techniques which are known to those of ordinary skill in the art (see, e.g., Sambrook et
al., 1989, supra).

The parent protease may be a mature part of any of the amino acid sequences
referred to above. A mature part means a mature amino acid sequence and refers to that part
of an amino acid sequence which remains after a potential signal peptide part and/or
propeptide part has been cleaved off.

The parent protease may also be a fragment of a specified amino acid sequence, viz. a
polypeptide having one or more amino acids deleted from the amino and/or carboxyl terminus
of this amino acid sequence. In one embodiment, a fragment contains at least 80, or at least
90, or at least 100, or at least 110, or at least 120, or at least 130, or at least 140, or at least
150, or at least 160, or at least 170, or at least 180, or at least 185 amino acid residues.

The parent protease may also be an allelic variant, allelic referring to the existence of
two or more alternative forms of a gene occupying the same chromosomal locus. Allelic
variation arises naturally through mutation, and may result in polymorphism within populations.
Gene mutations can be silent (no change in the encoded polypeptide) or may encode
polypeptides having altered amino acid sequences. An allelic variant of a polypeptide is a
polypeptide encoded by an allelic variant of a gene.

In another embodiment, the parent protease may be a genetically engineered
protease, e.g. a variant of the wild-type or natural parent proteases referred to above
comprising a substitution, deletion, and/or insertion of one or more amino acids. In other
words: The parent protease may itself be a protease variant, such as Protease 22. The amino
acid sequence of such parent protease may differ from the amino acid sequence specified by
an insertion or deletion of one or more amino acid residues and/or the substitution of one or
more amino acid residues by different amino acid residues. The amino acid changes may be
of a minor, or of a major, nature. Amino acid changes of a major nature are e.g. those
resulting in a variant protease of the present invention with amended properties. In another
particular embodiment, the amino acid changes are of a minor nature, that is conservative
amino acid substitutions that do not significantly affect the folding and/or activity of the protein;
small deletions, typically of one to about 30 amino acids; small amino- or carboxyl-terminal
extensions, such as an amino-terminal methionine residue; a small linker peptide of up to
about 20-25 residues; or a small extension that facilitates purification by changing net charge
or another function, such as a poly-histidine tract, an antigenic epitope or a binding domain.

Examples of conservative substitutions are within the group of basic amino acids
(arginine, lysine and histidine), acidic amino acids (glutamic acid and aspartic acid), polar
amino acids (glutamine and asparagine), hydrophobic amino acids (leucine, isoleucine and
valine), aromatic amino acids (phenylalanine, tryptophan and tyrosine), and small amino acids
(glycine, alanine, serine, threonine and methionine). Amino acid substitutions which do not
generally alter the specific activity are known in the art and are described, for example, by H.
Neurath and R.L. Hill, 1979, In, The Proteins, Academic Press, New York. The most
commonly occurring exchanges are Ala/Ser, Val/Ile, Asp/Glu, Thr/Ser, Ala/Gly, Ala/Thr,
Ser/Asn, Ala/Val, Ser/Gly, Tyr/Phe, Ala/Pro, Lys/Arg, Asp/Asn, Leu/Ile, Leu/Val, Ala/Glu, and
Asp/Gly as well as these in reverse.
Still further examples of genetically engineered parent proteases are synthetic
proteases, designed by man, and expectedly not occurring in nature. EP 897985 discloses a
process of preparing a consensus protein. Shuffled proteases are other examples of synthetic
or genetically engineered parent proteases, which can be prepared as is generally known in
the art, e.g. by Site-directed Mutagenesis, by PCR (using a PCR fragment containing the
desired mutation as one of the primers in the PCR reactions), or by Random Mutagenesis.
Included in the concept of a synthetic protease is also any hybrid or chimeric protease, i.e. a
protease which comprises a combination of partial amino acid sequences derived from at least
two proteases. Gene shuffling is generally described in e.g. WO 95/22625 and WO 96/00343.
Re-combination of protease genes can be made independently of the specific sequence of the
parents by synthetic shuffling as described in Ness, J.E. et al. in Nature Biotechnology, Vol. 20
(12), pp. 1251-1255, 2002. Synthetic oligonucleotides degenerated in their DNA sequence to
provide the possibility of all amino acids found in the set of parent proteases are designed and
the genes assembled according to the reference. The shuffling can be carried out for the full
length sequence or for only part of the sequence and then later combined with the rest of the
gene to give a full length sequence. Two, three, four, five or all six of the proteases
designated Protease 10, 18, 11, 35, 08 and 22 (SEQ ID NOs: 2, 4, 6, 8, 10, and 12) are
particular examples of such parent proteases which can be subjected to shuffling as described
above, to provide additional proteases of the invention.

In further particular embodiments, the parent protease comprises, or consists of,
respectively, the amino acid sequence specified, or an allelic variant thereof; or a fragment
thereof that has protease activity.

In still further particular embodiments, the protease variant of the invention is not
identical to:

(i) amino acids 1-188 of SEQ ID NO: 2, amino acids 1-188 of SEQ ID NO: 4, amino
acids 1-188 of SEQ ID NO: 6, amino acids 1-188 of SEQ ID NO: 8, and amino acids 1-188 of
SEQ ID NO: 10;

(ii) the protease derived from Nocardiopsis dassonvillei NRRL 18133;

(iii) the protease derived from Nocardiopsis sp. FERM P-10508;

(iv) the protease derived from Nocardiopsis dassonvillei strain ZIMET 43647; and/or

(v) any prior art protease of a percentage of identity to SEQ ID NO: 2 of at least 60%.

Microorganism Taxonomy

Questions relating to taxonomy may be solved by consulting a taxonomy data base,
such as the NCBI Taxonomy Browser which is available at the following internet site:
http://www.ncbi.nlm.nih.gov/Taxonomy/taxonomyhome.html/, and/or by consulting Taxonomy
handbooks. For the present purposes, the taxonomy is preferably according to the chapter:
The road map to the Manual by G.M. Garrity & J.G. Holt in Bergey's Manual of Systematic
Bacteriology, 2001, second edition, volume 1, David R. Boone, Richard W. Castenholz.

Amino Acid Homology 

The present invention refers to proteases, viz. parent proteases, and/or protease
variants, having a certain degree of identity to SEQ ID NO: 2, such parent and/or variant
proteases being hereinafter designated "homologous proteases".

For purposes of the present invention the degree of identity between two amino acid
sequences, as well as the degree of identity between two nucleotide sequences, is determined
by the program "align" which is a Needleman-Wunsch alignment (i.e. a global alignment). The
program is used for alignment of polypeptide, as well as nucleotide sequences. The default
scoring matrix BLOSUM50 is used for polypeptide alignments, and the default identity matrix is
used for nucleotide alignments. The penalty for the first residue of a gap is -12 for
polypeptides and -16 for nucleotides. The penalties for further residues of a gap are -2 for
polypeptides, and -4 for nucleotides.
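
The gap penalties stated above amount to an affine scheme in which a gap of n residues scores (first-residue penalty) + (n - 1) x (further-residue penalty). The following Python sketch works through that arithmetic only. It is not the "align" program, and the names `GAP_PENALTIES` and `gap_score` are hypothetical.

```python
# (penalty for the first residue of a gap, penalty for each further residue),
# taken from the values stated in the text.
GAP_PENALTIES = {
    "polypeptide": (-12, -2),
    "nucleotide": (-16, -4),
}


def gap_score(length, kind="polypeptide"):
    """Total score contribution of a single gap spanning `length` residues."""
    if length < 0:
        raise ValueError("gap length must be non-negative")
    if length == 0:
        return 0
    first, further = GAP_PENALTIES[kind]
    return first + further * (length - 1)
```

Under these values, a three-residue gap in a polypeptide alignment contributes -12 + 2 x (-2) = -16, and the same gap in a nucleotide alignment contributes -16 + 2 x (-4) = -24.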

"Align" is part of the FASTA package version v20u6 (see W. R. Pearson and D. J.
Lipman (1988), "Improved Tools for Biological Sequence Analysis", PNAS 85:2444-2448, and
W. R. Pearson (1990) "Rapid and Sensitive Sequence Comparison with FASTP and FASTA,"
Methods in Enzymology 183:63-98). FASTA protein alignments use the Smith-Waterman
algorithm with no limitation on gap size (see "Smith-Waterman algorithm", T. F. Smith and M.
S. Waterman (1981) J. Mol. Biol. 147:195-197).

Multiple alignments of protein sequences may be made using "ClustalW" (Thompson,
J.D., Higgins, D.G. and Gibson, T.J. (1994) CLUSTAL W: improving the sensitivity of
progressive multiple sequence alignment through sequence weighting, position-specific gap
penalties and weight matrix choice. Nucleic Acids Research, 22:4673-4680). Multiple
alignment of DNA sequences may be done using the protein alignment as a template,
replacing the amino acids with the corresponding codon from the DNA sequence.

In particular embodiments, the homologous protease has an amino acid sequence
which has a degree of identity to SEQ ID NO: 2 of at least 60%, 62%, 64%, 66%, 68%, 70%,
72%, 74%, 76%, 78%, 80%, 82%, 84%, 86%, 88%, 90%, 92%, 94%, 96%, 98%, or of at least
about 99%.

In alternative embodiments, the homologous protease has an amino acid sequence
which has a degree of identity to SEQ ID NO: 2 of at least 50%, 51%, 52%, 53%, 54%, 55%,
56%, 57%, 58%, or at least 59%.

In another particular embodiment, the parent protease, and/or the protease variant,
comprises a mature amino acid sequence which differs by no more than seventyfive,
seventyfour, seventythree, seventytwo, seventyone, seventy, sixtynine, sixtyeight, sixtyseven,
sixtysix, sixtyfive, sixtyfour, sixtythree, sixtytwo, sixtyone, sixty, fiftynine, fiftyeight, fiftyseven,
fiftysix, fiftyfive, fiftyfour, fiftythree, fiftytwo, fiftyone, fifty, fortynine, fortyeight, fortyseven,
fortysix, fortyfive, fortyfour, fortythree, fortytwo, fortyone, forty, thirtynine, thirtyeight,
thirtyseven, thirtysix, thirtyfive, thirtyfour, thirtythree, thirtytwo, thirtyone, thirty, twentynine,
twentyeight, twentyseven, twentysix, twentyfive, twentyfour, twentythree, twentytwo,
twentyone, twenty, nineteen, eighteen, seventeen, sixteen, fifteen, fourteen, thirteen, twelve,
eleven, ten, nine, eight, seven, six, five, four, three, by no more than two, or only by one amino
acid(s) from the specified amino acid sequence, e.g. SEQ ID NO: 2.

In a still further particular embodiment, the parent protease, and/or the protease
variant, comprises a mature amino acid sequence which differs by at least seventyfive,
seventyfour, seventythree, seventytwo, seventyone, seventy, sixtynine, sixtyeight, sixtyseven,
sixtysix, sixtyfive, sixtyfour, sixtythree, sixtytwo, sixtyone, sixty, fiftynine, fiftyeight, fiftyseven,
fiftysix, fiftyfive, fiftyfour, fiftythree, fiftytwo, fiftyone, fifty, fortynine, fortyeight, fortyseven,
fortysix, fortyfive, fortyfour, fortythree, fortytwo, fortyone, forty, thirtynine, thirtyeight,
thirtyseven, thirtysix, thirtyfive, thirtyfour, thirtythree, thirtytwo, thirtyone, thirty, twentynine,
twentyeight, twentyseven, twentysix, twentyfive, twentyfour, twentythree, twentytwo,
twentyone, twenty, nineteen, eighteen, seventeen, sixteen, fifteen, fourteen, thirteen, twelve,
eleven, ten, nine, eight, seven, six, five, four, three, by at least two, or by one amino acid(s)
from the specified amino acid sequence, e.g. SEQ ID NO: 2.

Nucleic Acid Hybridization 

In the alternative, homologous parent proteases, as well as variant proteases, may be
defined as being encoded by a nucleic acid sequence which hybridizes under very low
stringency conditions, preferably low stringency conditions, more preferably medium
stringency conditions, more preferably medium-high stringency conditions, even more
preferably high stringency conditions, and most preferably very high stringency conditions with
nucleotides 900-1466 of SEQ ID NO: 1, or a subsequence or a complementary strand thereof
(J. Sambrook, E.F. Fritsch, and T. Maniatis, 1989, Molecular Cloning, A Laboratory Manual,
2d edition, Cold Spring Harbor, New York). A subsequence may be at least 100 nucleotides,
or at least 200, 300, 400, or at least 500 nucleotides. Moreover, the subsequence may encode
a polypeptide fragment that has the relevant enzyme activity.

For long probes of at least 100 nucleotides in length, very low to very high stringency
conditions are defined as prehybridization and hybridization at 42°C in 5X SSPE, 0.3% SDS,
200 µg/ml sheared and denatured salmon sperm DNA, and either 25% formamide for very low
and low stringencies, 35% formamide for medium and medium-high stringencies, or 50%
formamide for high and very high stringencies, following standard Southern blotting
procedures.

For long probes of at least 100 nucleotides in length, the carrier material is finally
washed three times each for 15 minutes using 2 x SSC, 0.2% SDS preferably at least at 45°C
(very low stringency), more preferably at least at 50°C (low stringency), more preferably at
least at 55°C (medium stringency), more preferably at least at 60°C (medium-high stringency),
even more preferably at least at 65°C (high stringency), and most preferably at least at 70°C
(very high stringency).
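
The long-probe conditions described above pair each stringency level with a formamide concentration for hybridization and a minimum wash temperature. A minimal lookup sketch in Python follows; the dictionary and function names are hypothetical, and the values are simply those stated in the text.

```python
# stringency level -> (formamide % during hybridization at 42 deg C,
#                      minimum wash temperature in deg C, 2 x SSC / 0.2% SDS)
LONG_PROBE_STRINGENCY = {
    "very low": (25, 45),
    "low": (25, 50),
    "medium": (35, 55),
    "medium-high": (35, 60),
    "high": (50, 65),
    "very high": (50, 70),
}


def long_probe_conditions(level):
    """Return the formamide percentage and minimum wash temperature
    for a named stringency level (probes of at least 100 nucleotides)."""
    formamide, wash_temp = LONG_PROBE_STRINGENCY[level]
    return {"formamide_percent": formamide, "min_wash_temp_c": wash_temp}
```

For instance, `long_probe_conditions("medium")` gives 35% formamide and a wash at 55°C or above.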

For short probes which are about 15 nucleotides to about 70 nucleotides in length,
stringency conditions are defined as prehybridization, hybridization, and washing post-
hybridization at 5°C to 10°C below the calculated Tm using the calculation according to Bolton
and McCarthy (1962, Proceedings of the National Academy of Sciences USA 48:1390) in 0.9
M NaCl, 0.09 M Tris-HCl pH 7.6, 6 mM EDTA, 0.5% NP-40, 1X Denhardt's solution, 1 mM
sodium pyrophosphate, 1 mM sodium monobasic phosphate, 0.1 mM ATP, and 0.2 mg of
yeast RNA per ml following standard Southern blotting procedures.

For short probes which are about 15 nucleotides to about 70 nucleotides in length, the
carrier material is washed once in 6X SSC plus 0.1% SDS for 15 minutes and twice each for
15 minutes using 6X SSC at 5°C to 10°C below the calculated Tm.

Position numbering 

In the present context, the basis for numbering positions is SEQ ID NO: 2, Protease
10, starting with A1 and ending with T188, see Fig. 1. A parent protease, as well as a variant
protease, may comprise extensions as compared to SEQ ID NO: 2, i.e. in the N-terminal
and/or the C-terminal ends thereof. The amino acids of such extensions, if any, are to be
numbered as is usual in the art, i.e. for a C-terminal extension: 189, 190, 191 and so forth, and
for an N-terminal extension -1, -2, -3 and so forth.
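
The numbering convention can be illustrated with a short Python sketch. It assumes, as is usual in the art, that the residue immediately preceding A1 is numbered -1 and that there is no position 0; the function name `number_positions` is hypothetical.

```python
def number_positions(n_ext_len=0, c_ext_len=0, core_len=188):
    """List position numbers, N- to C-terminus, for a protease with extensions.

    n_ext_len: number of residues preceding position 1 (numbered -1, -2, ...
               counting away from position 1).
    c_ext_len: number of residues following position 188 (numbered 189, 190, ...).
    core_len:  length of the SEQ ID NO: 2 core (A1 to T188).
    """
    n_part = list(range(-n_ext_len, 0))        # e.g. -3, -2, -1
    core = list(range(1, core_len + 1))        # 1 .. 188
    c_part = list(range(core_len + 1, core_len + 1 + c_ext_len))
    return n_part + core + c_part
```

With three N-terminal and two C-terminal extension residues, the numbering runs -3, -2, -1, 1, ..., 188, 189, 190.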

Alterations, such as Substitutions, Deletions, Insertions

In the present context, the following are examples of various ways in which a protease
variant can be designed or derived from a parent amino acid sequence: An amino acid can be
substituted with another amino acid; an amino acid can be deleted; an amino acid can be
inserted; as well as any combination of any number of such alterations.

For the present purposes, the term substitution is intended to include any number of
any type of such alterations. This is a reasonable definition, because, for example, a deletion
can be regarded as a substitution of an amino acid, AA, in a given position, nn, with nothing,
(). Such substitution can be designated: AAnn(). Likewise, an insertion of only one amino acid,
BB, downstream of an amino acid, AA, in a given position, nn, can be designated: ()nnaBB. And
if two amino acids, BB and CC, are inserted downstream of amino acid AA in position nn, this
substitution (combination of two substitutions) can be designated: ()nnaBB+()nnbCC, the thus
created gaps between amino acids nn and nn+1 in the parent sequence being assigned lower
case or subscript letters a, b, c etc. to the former position number, here nn. A similar
numbering procedure is followed when aligning a new sequence to the multiple alignment of
Fig. 1, in case of a gap being created by the alignment between amino acids nn and nn+1:
Each position of the gap is assigned a number nna, nnb etc. A comma (,) between
substituents, as e.g. in the substitution T129E,D,Y,Q means "either or", i.e. that T129 is
substituted with E, or D, or Y, or Q. A plus-sign (+) between substitutions, e.g. 129D+135P
means "and", i.e. that these two single substitutions are combined in one and the same
protease variant.
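
The designation scheme lends itself to mechanical parsing. The following Python sketch is one possible reading of the notation, assuming one-letter amino acid codes; the names `parse_single` and `parse_variant` and the returned dictionary layout are hypothetical and not part of the disclosure.

```python
import re

# One designation: optional parent residue (or "()" for an insertion),
# position number, optional insertion letter, then one or more
# comma-separated alternatives ("()" meaning "nothing", i.e. a deletion).
_DESIGNATION = re.compile(
    r"^(?P<parent>\(\)|[A-Z])?"
    r"(?P<pos>-?\d+)"
    r"(?P<ins>[a-z])?"
    r"(?P<alts>(?:\(\)|[A-Z])(?:,(?:\(\)|[A-Z]))*)$"
)


def parse_single(token):
    """Parse one designation such as 'T129E,D,Y,Q', '129D', 'T129()' or '()129aG'."""
    match = _DESIGNATION.match(token.strip())
    if match is None:
        raise ValueError(f"unrecognized designation: {token!r}")
    return {
        "parent": match.group("parent"),          # None if omitted
        "position": int(match.group("pos")),
        "insertion": match.group("ins"),          # 'a', 'b', ... or None
        "alternatives": match.group("alts").split(","),  # comma = "either or"
    }


def parse_variant(designation):
    """Split on '+' ("and") and parse each combined designation."""
    return [parse_single(token) for token in designation.split("+")]
```

Under these assumptions, `parse_variant("129D+135P")` returns two entries that are to be combined in one variant, whereas `parse_variant("T129E,D,Y,Q")` returns a single entry with four alternatives.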

In the present context, the term "a substitution" means at least one substitution. At
least one means one or more, e.g. one, or two, or three, or four, or five, or six, or seven, or
eight, or nine, or ten, or twelve, or fourteen, or fifteen, or sixteen, or eighteen, or twenty, or
twenty-two, or twenty-four, or twenty-five, or twenty-eight, or thirty, and so on, to include in
principle, any number of substitutions. The variants of the invention, however, still have to be,
e.g., at least 60% identical to SEQ ID NO: 2, this percentage being determined by the
above-mentioned program. The substitutions can be applied to any position encompassed by any
region mentioned in claim 1, and variants comprising combinations of any number and type of
such substitutions are also included. The term substitution as used herein also includes
deletions, as well as extensions, or insertions, that may add to the length of the sequence
corresponding to SEQ ID NO: 2.

Furthermore, the term "a substitution" embraces a substitution into any one of the other
nineteen natural amino acids, or into other amino acids, such as non-natural amino acids. For
example, a substitution of amino acid T in position 22 includes each of the following
substitutions: 22A, 22C, 22D, 22E, 22F, 22G, 22H, 22I, 22K, 22L, 22M, 22N, 22P, 22Q, 22R,
22S, 22V, 22W, and 22Y. This is, by the way, equivalent to the designation 22X, wherein X
designates any amino acid. These substitutions can also be designated T22A, T22C, T22X,
etc. The same applies by analogy to each and every position mentioned herein, to specifically
include herein any one of such substitutions.
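
As an illustration of the enumeration above, the following Python sketch expands a position into its nineteen natural substitutions. The function name `expand_any` is hypothetical, and non-natural amino acids are deliberately left out of this sketch.

```python
# One-letter codes of the twenty natural amino acids.
NATURAL_AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def expand_any(parent, position):
    """List every substitution of `parent` at `position` into another natural amino acid."""
    return [
        f"{parent}{position}{aa}"
        for aa in NATURAL_AMINO_ACIDS
        if aa != parent
    ]
```

Calling `expand_any("T", 22)` gives the nineteen designations T22A through T22Y, which is the set covered by T22X when X is restricted to natural amino acids.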

Identifying Corresponding Position Numbers

For each amino acid residue in each parent or variant protease of the invention and/or
for use according to the invention, it is possible to directly and unambiguously assign an amino
acid residue in SEQ ID NO: 2 to which it corresponds. Corresponding residues are assigned
the same number, by reference to the Protease 10 sequence.

As it appears from the numbering of Fig. 1, in conjunction with the numbering of the
sequence listing, for each amino acid residue of each of the proteases Protease 10, Protease
18, Protease 11, Protease 35, and Protease 08, the corresponding amino acid residue in SEQ
ID NO: 2 has the same number. This number is easily derivable from Fig. 1. At least in case of
these five proteases, the number is the same as the number assigned to this amino acid
residue in the sequence listing for the mature part of the respective protease.

For a given position in another protease - be it a parent or a variant protease - a
corresponding position of SEQ ID NO: 2 can always be found, as follows:

The amino acid sequence of another parent protease, or, in turn, of a variant protease
amino acid sequence, is designated SEQ-X. A position corresponding to position N of SEQ ID
NO: 2 is found as follows: The parent or variant protease amino acid sequence SEQ-X is
aligned with SEQ ID NO: 2 as specified above in the section entitled Amino Acid Homology.
From the alignment, the position in sequence SEQ-X corresponding to position N of SEQ ID
NO: 2 can be clearly and unambiguously derived, using the principles described below.
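
Once an alignment is available, the numbering can be derived mechanically. The Python sketch below assumes a pre-computed pairwise alignment in which gaps are written as "-", and it assumes the alignment starts at position 1 of SEQ ID NO: 2 (N-terminal extensions, which would be numbered -1, -2 etc., are not handled). The function name `map_positions` is hypothetical, and the alignment itself is not computed here.

```python
def map_positions(ref_aligned, seqx_aligned, gap="-"):
    """Label each residue of SEQ-X with its SEQ ID NO: 2 based position.

    Residues of SEQ-X that fall in a gap of the reference between positions
    nn and nn+1 are labeled nna, nnb, ... as described in the text.
    Returns a list of (label, residue) pairs in sequence order.
    """
    if len(ref_aligned) != len(seqx_aligned):
        raise ValueError("aligned sequences must have equal length")
    labels = []
    ref_pos = 0      # last reference position seen
    inserted = 0     # residues inserted after ref_pos so far
    for ref_res, x_res in zip(ref_aligned, seqx_aligned):
        if ref_res != gap:
            ref_pos += 1
            inserted = 0
            if x_res != gap:
                labels.append((str(ref_pos), x_res))
            # else: SEQ-X has a deletion at this reference position
        elif x_res != gap:
            labels.append((f"{ref_pos}{chr(ord('a') + inserted)}", x_res))
            inserted += 1
    return labels
```

For a toy alignment of reference `AC-DE` against `ACGD-`, the inserted G is labeled 2a and reference position 4 has no counterpart in SEQ-X.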

SEQ-X may be a mature part of the protease in question, or it may also include a
signal peptide part, or it may be a fragment of the mature protease which has protease
activity, e.g. a fragment of the same length as SEQ ID NO: 2, and/or it may be the fragment
which extends from A1 to T188 when aligned with SEQ ID NO: 2 as described herein.

Region and Position 

In the present context, the term region means at least one position of a parent
protease amino acid sequence, the term position designating an amino acid residue of such
amino acid sequence. In one embodiment, region means one or more successive positions of
the parent protease amino acid sequence, e.g. one, two, three, four, five, six, seven, eight,
etc., up to any number of consecutive positions of the sequence. Accordingly, a region may
consist of one position only, or it may consist of any number of consecutive positions, such as,
e.g., position no. 62 and 63; or position no. 111, 112, 113 and 114. For the present purposes,
these two regions are designated 62-63, and 111-114, respectively. The boundaries of these
regions or ranges are included in the region.

A region encompasses specifically each and every position it embraces. For example,
region 111-114 specifically encompasses each of the positions 111, 112, 113, and 114. The
same applies by analogy for the other regions mentioned herein.
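
The region convention, with inclusive boundaries, can be expressed as a small Python sketch. The function names `parse_region` and `position_in_regions` are hypothetical and serve only to illustrate the definition.

```python
def parse_region(region):
    """Expand a region designation such as '111-114' (or '62') into its positions.

    Both boundaries are included, as stated in the text.
    """
    start, _, end = region.partition("-")
    first = int(start)
    last = int(end) if end else first
    if last < first:
        raise ValueError(f"invalid region: {region!r}")
    return list(range(first, last + 1))


def position_in_regions(position, regions):
    """Return True if `position` is encompassed by any of the given regions."""
    return any(position in parse_region(region) for region in regions)
```

For instance, position 112 falls within the regions `["62-63", "111-114"]`, whereas position 64 does not.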

Thermostability 

For the present purposes, the term thermostable as applied in the context of a certain
polypeptide, refers to the melting temperature, Tm, of such polypeptide, as determined using
Differential Scanning Calorimetry (DSC) in 10 mM sodium phosphate, 50 mM sodium chloride,
pH 7.0, using a constant scan rate of 1.5°C/min.

The following Tm's were determined under the above conditions: 76.5°C (Protease 10),
83.0°C (Protease 18), 78.3°C (Protease 08), 76.6°C (Protease 35), 73.7°C (Protease 11), and
83.5°C (Protease 22).

For a thermostable polypeptide, the Tm is at least 83.1°C. In particular embodiments,
the Tm is at least 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99 or at least
100°C.

In the alternative, the term thermostable refers to a melting temperature of at least
73.8, or at least 76.7°C, or at least 78.4°C, preferably at least 74, 75, 76, 77, 78, 79, 80, 81,
82, or at least 83°C, still as determined using DSC at a pH of 7.0.

For the determination of Tm, a sample of the polypeptide with a purity of at least 90%
(or 91, 92, 93, 94, 95, 96, 97, or 98%) as determined by SDS-PAGE may be used. Still further,
the enzyme sample may have a concentration of between 0.5 and 2.5 mg/ml protein (or
between 0.6 and 2.4, or between 0.7 and 2.2, or between 0.8 and 2.0 mg/ml protein), as
determined from absorbance at 280 nm and based on an extinction coefficient calculated from
the amino acid sequence of the enzyme in question.

The DSC takes place at the desired pH (e.g. pH 5.5, 7.0, 3.0, or 2.5) and with a
constant heating rate, e.g. of 1, 1.5, 2, 3, 4, 5, 6, 7, 8, 9 or 10°C/min.

In a particular embodiment, the protease variant of the invention is more thermostable
than the parent protease. A preferred parent protease for this purpose is Protease 18.

Temperature Activity Profile 

In a particular embodiment, the protease variant of the invention exhibits an amended
temperature activity profile as compared to, e.g., Protease 10 (or Protease 18, Protease 11,
Protease 35, or Protease 08). For example, the protease variant of the invention may exhibit a
relative activity at pH 9 and 80°C of at least 0.40, preferably at least 0.45, 0.50, 0.55, 0.60,
0.65, 0.70, 0.75, 0.80, 0.85, 0.90, or at least 0.95, the term "relative" referring to the maximum
activity measured for the protease in question. For Protease 22, the activity is relative to the
activity at 80°C which is set to 1.000 (100%), and for Protease 10, the activity at 70°C is set to
1.000 (100%), see Example 2. As another example, the protease variant of the invention
exhibits a relative activity at pH 9 and 90°C of at least 0.10, preferably at least 0.15, 0.20,
0.25, 0.30, or of at least 0.35. In a particular embodiment, the protease activity is measured
using the Protazyme AK assay of Example 1.

Low-allergenic Variants

In a specific embodiment, the protease variants of the present invention are (also)
low-allergenic variants, designed to invoke a reduced immunological response when exposed to
animals, including man. The term immunological response is to be understood as any reaction
by the immune system of an animal exposed to the protease variant. One type of
immunological response is an allergic response leading to increased levels of IgE in the
exposed animal. Low-allergenic variants may be prepared using techniques known in the art.
For example the protease variant may be conjugated with polymer moieties shielding portions
or epitopes of the protease variant involved in an immunological response. Conjugation with
polymers may involve in vitro chemical coupling of polymer to the protease variant, e.g. as
described in WO 96/17929, WO 98/30682, WO 98/35026, and/or WO 99/00489. Conjugation
may in addition or alternatively thereto involve in vivo coupling of polymers to the protease
variant. Such conjugation may be achieved by genetic engineering of the nucleotide sequence
encoding the protease variant, inserting consensus sequences encoding additional
glycosylation sites in the protease variant and expressing the protease variant in a host
capable of glycosylating the protease variant, see e.g. WO 00/26354. Another way of
providing low-allergenic variants is genetic engineering of the nucleotide sequence encoding
the protease variant so as to cause the protease variants to self-oligomerize, effecting that
protease variant monomers may shield the epitopes of other protease variant monomers and
thereby lowering the antigenicity of the oligomers. Such products and their preparation are
described e.g. in WO 96/16177. Epitopes involved in an immunological response may be
identified by various methods such as the phage display method described in WO 00/26230
and WO 01/83559, or the random approach described in EP 561907. Once an epitope has
been identified, its amino acid sequence may be altered to produce altered immunological
properties of the protease variant by known gene manipulation techniques such as site
directed mutagenesis (see e.g. WO 00/26230, WO 00/26354 and/or WO 00/22103) and/or
conjugation of a polymer may be done in sufficient proximity to the epitope for the polymer to
shield the epitope.

Nucleic Acid Sequences and Constructs 

The present invention also relates to nucleic acid sequences comprising a nucleic acid 
sequence which encodes a protease variant of the invention.

The term "isolated nucleic acid sequence" refers to a nucleic acid sequence which is
essentially free of other nucleic acid sequences, e.g., at least about 20% pure, preferably at
least about 40% pure, more preferably at least about 60% pure, even more preferably at least
about 80% pure, and most preferably at least about 90% pure as determined by agarose
electrophoresis. For example, an isolated nucleic acid sequence can be obtained by standard
cloning procedures used in genetic engineering to relocate the nucleic acid sequence from its
natural location to a different site where it will be reproduced. The cloning procedures may
involve excision and isolation of a desired nucleic acid fragment comprising the nucleic acid
sequence encoding the polypeptide, insertion of the fragment into a vector molecule, and
incorporation of the recombinant vector into a host cell where multiple copies or clones of the
nucleic acid sequence will be replicated. The nucleic acid sequence may be of genomic,
cDNA, RNA, semisynthetic, synthetic origin, or any combinations thereof.

The nucleic acid sequences of the invention can be prepared by introducing at least
one mutation into the parent protease coding sequence or a subsequence thereof, wherein the
mutant nucleic acid sequence encodes a variant protease. The introduction of a mutation into
the nucleic acid sequence to exchange one nucleotide for another nucleotide may be
accomplished using any of the methods known in the art, e.g. by site-directed mutagenesis,
by random mutagenesis, or by doped, spiked, or localized random mutagenesis.

Random mutagenesis is suitably performed either as localized or region-specific
random mutagenesis in at least three parts of the gene translating to the amino acid sequence
shown in question, or within the whole gene. When the mutagenesis is performed by the use
of an oligonucleotide, the oligonucleotide may be doped or spiked with the three non-parent
nucleotides during the synthesis of the oligonucleotide at the positions which are to be
changed. The doping or spiking may be performed so that codons for unwanted amino acids
are avoided. The doped or spiked oligonucleotide can be incorporated into the DNA encoding
the protease enzyme by any technique, using, e.g., PCR, LCR or any DNA polymerase and
ligase as deemed appropriate.

Preferably, the doping is carried out using "constant random doping", in which the 
percentage of wild-type and mutation in each position is predefined. Furthermore, the doping 
may be directed toward a preference for the introduction of certain nucleotides, and thereby a 
preference for the introduction of one or more specific amino acid residues. The doping may
be made, e.g., so as to allow for the introduction of 90% wild type and 10% mutations in each 
position. An additional consideration in the choice of a doping scheme is based on genetic as 
well as protein-structural constraints. 
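
A doping scheme of this kind can be simulated in silico. The following Python sketch assumes an uppercase DNA oligonucleotide and, as a simplifying assumption not stated in the text, draws the three non-parent nucleotides with equal probability; the function name `dope_oligo` and its parameters are hypothetical.

```python
import random

NUCLEOTIDES = "ACGT"


def dope_oligo(parent, wild_type_fraction=0.90, rng=None):
    """Simulate one doped oligonucleotide derived from `parent`.

    At each position the parent (wild-type) nucleotide is kept with
    probability `wild_type_fraction`; otherwise one of the three
    non-parent nucleotides is chosen uniformly at random.
    """
    rng = rng or random.Random()
    doped = []
    for base in parent:
        if rng.random() < wild_type_fraction:
            doped.append(base)
        else:
            doped.append(rng.choice([n for n in NUCLEOTIDES if n != base]))
    return "".join(doped)
```

With the default of 90% wild type, roughly one position in ten of a simulated oligonucleotide differs from the parent. A biased scheme favoring certain nucleotides, as mentioned above, would replace the uniform choice with position-specific weights.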

The random mutagenesis may be advantageously localized to a part of the parent
protease in question. This may, e.g., be advantageous when certain regions of the enzyme
have been identified to be of particular importance for a given property of the enzyme.

Alternative methods for providing variants of the invention include gene shuffling e.g.
as described in WO 95/22625 or in WO 96/00343, and the consensus derivation process as
described in EP 897985 (see the section "Parent Protease" for more details).

The present invention does not relate to the following nucleic acid sequences: 

(i) Nucleotides 900-1466 of SEQ ID NO: 1, nucleotides 499-1062 of SEQ ID NO: 3,
nucleotides 496-1059 of SEQ ID NO: 5, nucleotides 496-1059 of SEQ ID NO: 7, and
nucleotides 502-1065 of SEQ ID NO: 9;

(ii) the nucleic acid sequence encoding the mature peptide part of the protease derived
from Nocardiopsis dassonvillei NRRL 18133, in the event that this protease has at least 60%
identity to SEQ ID NO: 2;

(iii) the nucleic acid sequence encoding the mature peptide part of the protease derived
from Nocardiopsis sp. FERM P-10508, in the event that this protease has at least 60% identity
to SEQ ID NO: 2;

(iv) the nucleic acid sequence encoding the mature peptide part of the protease
derived from Nocardiopsis dassonvillei strain ZIMET 43647; and/or

(v) nucleic acid sequences encoding any prior art proteases of at least 60% identity to
SEQ ID NO: 2.

Nucleic Acid Constructs 

A nucleic acid construct comprises a nucleic acid sequence of the present invention
operably linked to one or more control sequences which direct the expression of the coding
sequence in a suitable host cell under conditions compatible with the control sequences.
Expression will be understood to include any step involved in the production of the polypeptide
including, but not limited to, transcription, post-transcriptional modification, translation,
post-translational modification, and secretion.

Expression vector

A nucleic acid sequence encoding a protease variant of the invention can be
expressed using an expression vector which typically includes control sequences encoding a
promoter, operator, ribosome binding site, translation initiation signal, and, optionally, a
repressor gene or various activator genes.

The recombinant expression vector carrying the DNA sequence encoding a protease 
variant of the invention may be any vector which may conveniently be subjected to 
recombinant DNA procedures, and the choice of vector will often depend on the host cell into 
which it is to be introduced. The vector may be one which, when introduced into a host cell, is 
integrated into the host cell genome and replicated together with the chromosome(s) into
which it has been integrated. 

The protease variant may also be co-expressed together with at least one other
enzyme of animal feed interest, such as an alpha-amylase, a phytase, a galactanase, a
xylanase, an endoglucanase, an endo-1,3(4)-beta-glucanase, an alpha-galactosidase, and/or
a protease. The enzymes may be co-expressed from different vectors, from one vector, or
using a mixture of both techniques. When using different vectors, the vectors may have
different selectable markers, and different origins of replication. When using only one vector,
the genes can be expressed from one or more promoters. If cloned under the regulation of
one promoter (di- or multi-cistronic), the order in which the genes are cloned may affect the
expression levels of the proteins. The protease variant may also be expressed as a fusion
protein, i.e. that the gene encoding the protease variant has been fused in frame to the gene
encoding another protein. This protein may be another enzyme or a functional domain from
another enzyme.

Host Cells

The present invention also relates to recombinant host cells, comprising a nucleic acid
sequence of the invention, which are advantageously used in the recombinant production of
the polypeptides. A vector comprising a nucleic acid sequence of the present invention is
introduced into a host cell so that the vector is maintained as a chromosomal integrant or as a
self-replicating extra-chromosomal vector. The term "host cell" encompasses any progeny of a
parent cell that is not identical to the parent cell due to mutations that occur during replication.
The choice of a host cell will to a large extent depend upon the gene encoding the polypeptide
and its source.

The host cell may be a unicellular microorganism, e.g., a prokaryote, or a non-unicellular
microorganism, e.g., a eukaryote cell, such as an animal, a mammalian, an insect,
a plant, or a fungal cell. Preferred animal cells are non-human animal cells.

In a preferred embodiment, the host cell is a fungal cell, or a yeast cell such as a
Candida, Hansenula, Kluyveromyces, Pichia, Saccharomyces, Schizosaccharomyces, or
Yarrowia cell. The fungal host cell may be a filamentous fungal cell, such as a cell of a species
of, but not limited to, Acremonium, Aspergillus, Fusarium, Humicola, Mucor, Myceliophthora,
Neurospora, Penicillium, Thielavia, Tolypocladium, or Trichoderma. Useful unicellular cells are
bacterial cells such as gram positive bacteria including, but not limited to, a Bacillus cell, e.g.,
Bacillus alkalophilus, Bacillus amyloliquefaciens, Bacillus brevis, Bacillus circulans, Bacillus
clausii, Bacillus coagulans, Bacillus lautus, Bacillus lentus, Bacillus licheniformis, Bacillus
megaterium, Bacillus stearothermophilus, Bacillus subtilis, and Bacillus thuringiensis, or a
Streptomyces cell, such as Streptomyces lividans or Streptomyces murinus, or a Nocardiopsis
cell, or cells of lactic acid bacteria; or gram negative bacteria such as E. coli and
Pseudomonas sp. Lactic acid bacteria include, but are not limited to, species of the genera
Lactococcus, Lactobacillus, Leuconostoc, Streptococcus, Pediococcus, and Enterococcus.

Methods of Production

The present invention also relates to methods for producing a protease variant of the
present invention comprising (a) cultivating a host cell under conditions conducive for
production of the protease variant; and (b) recovering the protease variant.

In the production methods of the present invention, the cells are cultivated in a nutrient
medium suitable for production of the polypeptide using methods known in the art. For
example, the cell may be cultivated by shake flask cultivation, small-scale or large-scale
fermentation (including continuous, batch, fed-batch, or solid state fermentations) in laboratory
or industrial fermentors performed in a suitable medium and under conditions allowing the
polypeptide to be expressed and/or isolated. The cultivation takes place in a suitable nutrient
medium comprising carbon and nitrogen sources and inorganic salts, using procedures known
in the art. Suitable media are available from commercial suppliers or may be prepared
according to published compositions (e.g., in catalogues of the American Type Culture
Collection). If the protease is secreted into the nutrient medium, it can be recovered directly
from the medium. If it is not secreted, it can be recovered from cell lysates.

The resulting protease may be recovered by methods known in the art. For example, it
can be recovered from the nutrient medium by conventional procedures including, but not
limited to, centrifugation, filtration, extraction, spray-drying, evaporation, or precipitation.

The proteases of the present invention may be purified by a variety of procedures
known in the art including, but not limited to, chromatography (e.g., ion exchange, affinity,
hydrophobic, chromatofocusing, and size exclusion), electrophoretic procedures (e.g.,
preparative isoelectric focusing), differential solubility (e.g., ammonium sulfate precipitation),
SDS-PAGE, or extraction (see, e.g., Protein Purification, J.-C. Janson and Lars Ryden,
editors, VCH Publishers, New York, 1989).

Plants 

The present invention also relates to a transgenic plant, plant part, or plant cell which
has been transformed with a nucleic acid sequence encoding a polypeptide having protease
activity of the present invention so as to express and produce the polypeptide in recoverable
quantities. The polypeptide may be recovered from the plant or plant part. Alternatively, the
plant or plant part containing the recombinant polypeptide may be used as such for improving
the quality of a food or feed, e.g., improving nutritional value, palatability, and rheological
properties, or to destroy an antinutritive factor.

In a particular embodiment, the polypeptide is targeted to the endosperm storage
vacuoles in seeds. This can be obtained by synthesizing it as a precursor with a suitable signal
peptide, see Horvath et al in PNAS, Feb. 15, 2000, vol. 97, no. 4, p. 1914-1919.

The transgenic plant can be dicotyledonous (a dicot) or monocotyledonous (a
monocot) or engineered variants thereof. Examples of monocot plants are grasses, such as
meadow grass (blue grass, Poa), forage grass such as Festuca, Lolium, temperate grass,
such as Agrostis, and cereals, e.g., wheat, oats, rye, barley, rice, sorghum, and maize (corn).
Examples of dicot plants are tobacco, legumes, such as lupins, potato, sugar beet,
pea, bean and soybean, and cruciferous plants (family Brassicaceae), such as cauliflower,
rape seed, and the closely related model organism Arabidopsis thaliana. Low-phytate plants
as described e.g. in US patent no. 5,689,054 and US patent no. 6,111,168 are examples of
engineered plants. Examples of plant parts are stem, callus, leaves, root, fruits, seeds, and
tubers, as well as the individual tissues comprising these parts, e.g. epidermis, mesophyll,
parenchyma, vascular tissues, meristems. Also specific plant cell compartments, such as
chloroplast, apoplast, mitochondria, vacuole, peroxisomes, and cytoplasm are considered to
be a plant part. Furthermore, any plant cell, whatever the tissue origin, is considered to be a
plant part. Likewise, plant parts such as specific tissues and cells isolated to facilitate the
utilisation of the invention are also considered plant parts, e.g. embryos, endosperms,
aleurone and seed coats.

Also included within the scope of the present invention are the progeny of such
plants, plant parts and plant cells.

The transgenic plant or plant cell expressing a polypeptide of the present invention
may be constructed in accordance with methods known in the art. Briefly, the plant or plant cell
is constructed by incorporating one or more expression constructs encoding a polypeptide of
the present invention into the plant host genome and propagating the resulting modified plant
or plant cell into a transgenic plant or plant cell.

Conveniently, the expression construct is a nucleic acid construct which comprises a
nucleic acid sequence encoding a polypeptide of the present invention operably linked with
appropriate regulatory sequences required for expression of the nucleic acid sequence in the
plant or plant part of choice. Furthermore, the expression construct may comprise a selectable
marker useful for identifying host cells into which the expression construct has been integrated
and DNA sequences necessary for introduction of the construct into the plant in question (the
latter depends on the DNA introduction method to be used).

The choice of regulatory sequences, such as promoter and terminator sequences and
optionally signal or transit sequences, is determined, for example, on the basis of when,
where, and how the polypeptide is desired to be expressed. For instance, the expression of
the gene encoding a polypeptide of the present invention may be constitutive or inducible, or
may be developmental, stage or tissue specific, and the gene product may be targeted to a
specific tissue or plant part such as seeds or leaves. Regulatory sequences are, for example,
described by Tague et al., 1988, Plant Physiology 86: 506.

For constitutive expression, the following promoters may be used: The 35S-CaMV
promoter (Franck et al., 1980, Cell 21: 285-294), the maize ubiquitin 1 (Christensen AH,
Sharrock RA and Quail 1992. Maize polyubiquitin genes: structure, thermal perturbation of
expression and transcript splicing, and promoter activity following transfer to protoplasts by
electroporation), or the rice actin 1 promoter (Plant Mol. Biol. 18, 675-689; Zhang W, McElroy
D. and Wu R 1991, Analysis of rice Act1 5' region activity in transgenic rice plants. Plant Cell
3, 1155-1165). Organ-specific promoters may be, for example, a promoter from storage sink
tissues such as seeds, potato tubers, and fruits (Edwards & Coruzzi, 1990, Ann. Rev. Genet.
24: 275-303), or from metabolic sink tissues such as meristems (Ito et al., 1994, Plant Mol.
Biol. 24: 863-878), a seed specific promoter such as the glutelin, prolamin, globulin, or albumin
promoter from rice (Wu et al., 1998, Plant and Cell Physiology 39: 885-889), a Vicia faba
promoter from the legumin B4 and the unknown seed protein gene from Vicia faba (Conrad et
al., 1998, Journal of Plant Physiology 152: 708-711), a promoter from a seed oil body protein
(Chen et al., 1998, Plant and Cell Physiology 39: 935-941), the storage protein napA promoter
from Brassica napus, or any other seed specific promoter known in the art, e.g., as described
in WO 91/14772. Furthermore, the promoter may be a leaf specific promoter such as the rbcs
promoter from rice or tomato (Kyozuka et al., 1993, Plant Physiology 102: 991-1000), the
chlorella virus adenine methyltransferase gene promoter (Mitra and Higgins, 1994, Plant
Molecular Biology 26: 85-93), or the aldP gene promoter from rice (Kagaya et al., 1995,
Molecular and General Genetics 248: 668-674), or a wound inducible promoter such as the
potato pin2 promoter (Xu et al., 1993, Plant Molecular Biology 22: 573-588). Likewise, the
promoter may be inducible by abiotic treatments such as temperature, drought or alterations in
salinity or inducible by exogenously applied substances that activate the promoter, e.g.
ethanol, oestrogens, plant hormones like ethylene, abscisic acid, gibberellic acid, and/or heavy
metals.

A promoter enhancer element may also be used to achieve higher expression of the
enzyme in the plant. For instance, the promoter enhancer element may be an intron which is
placed between the promoter and the nucleotide sequence encoding a polypeptide of the
present invention. For instance, Xu et al., 1993, supra disclose the use of the first intron of the
rice actin 1 gene to enhance expression.

Still further, the codon usage may be optimized for the plant species in question to
improve expression (see Horvath et al referred to above).

The selectable marker gene and any other parts of the expression construct may be
chosen from those available in the art.

The nucleic acid construct is incorporated into the plant genome according to
conventional techniques known in the art, including Agrobacterium-mediated transformation,
virus-mediated transformation, microinjection, particle bombardment, biolistic transformation,
and electroporation (Gasser et al., 1990, Science 244: 1293; Potrykus, 1990, Bio/Technology
8: 535; Shimamoto et al., 1989, Nature 338: 274).

Presently, Agrobacterium tumefaciens-mediated gene transfer is the method of
choice for generating transgenic dicots (for a review, see Hooykas and Schilperoort, 1992,
Plant Molecular Biology 19: 15-38), and it can also be used for transforming monocots,
although other transformation methods are generally preferred for these plants. Presently, the
method of choice for generating transgenic monocots, supplementing the Agrobacterium
approach, is particle bombardment (microscopic gold or tungsten particles coated with the
transforming DNA) of embryonic calli or developing embryos (Christou, 1992, Plant Journal 2:
275-281; Shimamoto, 1994, Current Opinion Biotechnology 5: 158-162; Vasil et al., 1992,
Bio/Technology 10: 667-674). An alternative method for transformation of monocots is based
on protoplast transformation as described by Omirulleh et al., 1993, Plant Molecular Biology
21: 415-428.

Following transformation, the transformants having incorporated therein the
expression construct are selected and regenerated into whole plants according to methods
well-known in the art.

The present invention also relates to methods for producing a polypeptide of the
present invention comprising (a) cultivating a transgenic plant or a plant cell comprising a
nucleic acid sequence encoding a protease variant of the present invention under conditions
conducive for production of the protease variant; and (b) recovering the protease variant.

Animals as Expression Hosts 

The present invention also relates to a transgenic, non-human animal and products or
elements thereof, examples of which are body fluids such as milk and blood, organs, flesh,
and animal cells. Techniques for expressing proteins, e.g. in mammalian cells, are known in
the art, see e.g. the handbook Protein Expression: A Practical Approach, Higgins and Hames
(eds), Oxford University Press (1999), and the three other handbooks in this series relating to
Gene Transcription, RNA processing, and Post-translational Processing. Generally speaking,
to prepare a transgenic animal, selected cells of a selected animal are transformed with a
nucleic acid sequence encoding a protease variant of the present invention so as to express
and produce the protease variant. The protease variant may be recovered from the animal,
e.g. from the milk of female animals, or it may be expressed to the benefit of the animal itself,
e.g. to assist the animal's digestion. Examples of animals are mentioned below in the section
headed Animal Feed and Animal Feed Additives.

To produce a transgenic animal with a view to recovering the protease variant from the
milk of the animal, a gene encoding the protease variant may be inserted into the fertilized
eggs of an animal in question, e.g. by use of a transgene expression vector which comprises a
suitable milk protein promoter, and the gene encoding the protease variant. The transgene
expression vector is microinjected into fertilized eggs, and preferably permanently integrated
into the chromosome. Once the egg begins to grow and divide, the potential embryo is
implanted into a surrogate mother, and animals carrying the transgene are identified. The
resulting animal can then be multiplied by conventional breeding. The protease variant may be
purified from the animal's milk, see e.g. Meade, H.M. et al (1999): Expression of recombinant
proteins in the milk of transgenic animals, Gene expression systems: Using nature for the art
of expression. J. M. Fernandez and J. P. Hoeffler (eds.), Academic Press.

In the alternative, in order to produce a transgenic non-human animal that carries in
the genome of its somatic and/or germ cells a nucleic acid sequence including a heterologous
transgene construct including a transgene encoding the protease variant, the transgene may
be operably linked to a first regulatory sequence for salivary gland specific expression of the
protease variant, as disclosed in WO 2000064247.

Animal Feed and Animal Feed Additives

For the present purposes, the term animal includes all animals, including human
beings. In a particular embodiment, the protease variants and compositions of the invention
can be used as a feed additive for non-human animals. Examples of animals are
non-ruminants, and ruminants, such as sheep, goats, horses, and cattle, e.g. beef cattle, cows,
and young calves. In a particular embodiment, the animal is a non-ruminant animal.
Non-ruminant animals include mono-gastric animals, e.g. pigs or swine (including, but not limited
to, piglets, growing pigs, and sows); poultry such as turkeys, ducks and chicken (including but
not limited to broiler chicks, layers); young calves; and fish (including but not limited to salmon,
trout, tilapia, catfish and carps); and crustaceans (including but not limited to shrimps and
prawns).

The term feed or feed composition means any compound, preparation, mixture, or
composition suitable for, or intended for intake by an animal. The feed can be fed to the
animal before, after, or simultaneously with the diet. The latter is preferred.

The composition of the invention, when intended for addition to animal feed, may be
designated an animal feed additive. Such additive always comprises the protease variant in
question, preferably in the form of stabilized liquid or dry compositions. The additive may
comprise other components or ingredients of animal feed. The so-called pre-mixes for animal
feed are particular examples of such animal feed additives. Pre-mixes may contain the
enzyme(s) in question, and in addition at least one vitamin and/or at least one mineral.

Accordingly, in a particular embodiment, in addition to the component polypeptides, the
composition of the invention may comprise or contain at least one fat-soluble vitamin, and/or
at least one water-soluble vitamin, and/or at least one trace mineral. Also at least one macro
mineral may be included.

Examples of fat-soluble vitamins are vitamin A, vitamin D3, vitamin E, and vitamin K,
e.g. vitamin K3.

Examples of water-soluble vitamins are vitamin B12, biotin and choline, vitamin B1,
vitamin B2, vitamin B6, niacin, folic acid and pantothenate, e.g. Ca-D-pantothenate.

Examples of trace minerals are manganese, zinc, iron, copper, iodine, selenium, and
cobalt.

Examples of macro minerals are calcium, phosphorus and sodium.

Further, optional, feed-additive ingredients are colouring agents, aroma compounds,
stabilizers, additional enzymes, and antimicrobial peptides.

Additional enzyme components of the composition of the invention include at least one
polypeptide having alpha-amylase activity, and/or at least one polypeptide having xylanase
activity, and/or at least one polypeptide having endoglucanase activity; and/or at least one
polypeptide having endo-1,3(4)-beta-glucanase activity; and/or at least one polypeptide having
phytase activity; and/or at least one polypeptide having galactanase activity; and/or at least
one polypeptide having alpha-galactosidase activity.

Alpha-amylase activity can be measured as is known in the art, e.g. using a starch-
based substrate.

Xylanase activity can be measured using any assay in which a substrate is employed
that includes 1,4-beta-D-xylosidic endo-linkages in xylans. Different types of substrates are
available for the determination of xylanase activity, e.g. Xylazyme cross-linked arabinoxylan
tablets (from MegaZyme), or insoluble powder dispersions and solutions of azo-dyed
arabinoxylan.

Endoglucanase activity can be determined using any endoglucanase assay known in
the art. For example, various cellulose- or beta-glucan-containing substrates can be applied.
An endoglucanase assay may use AZCL-Barley beta-Glucan, or preferably (1) AZCL-HE-
Cellulose, or (2) Azo-CM-cellulose as a substrate. In both cases, the degradation of the
substrate is followed spectrophotometrically at OD595 (see the Megazyme method for AZCL-
polysaccharides for the assay of endo-hydrolases at
http://www.megazyme.com/booklets/AZCLPOL.pdf).

Endo-1,3(4)-beta-glucanase activity can be determined using any endo-1,3(4)-beta-
glucanase assay known in the art. A preferred substrate for endo-1,3(4)-beta-glucanase
activity measurements is a cross-linked azo-coloured beta-glucan Barley substrate, wherein
the measurements are based on spectrophotometric determination principles.

Phytase activity can be measured using any suitable assay, e.g. the FYT assay
described in Example 4 of WO 98/28408.

Galactanase can be assayed e.g. with AZCL galactan from Megazyme, and
alpha-galactosidase can be assayed e.g. with pNP-alpha-galactoside.

For assaying these enzyme activities the assay-pH and the assay-temperature are to
be adapted to the enzyme in question (preferably a pH close to the optimum pH, and a
temperature close to the optimum temperature). A preferred assay pH is in the range of 2-10,
preferably 3-9, more preferably pH 3 or 4 or 5 or 6 or 7 or 8, for example pH 3 or pH 7. A
preferred assay temperature is in the range of 20-90°C, preferably 30-90°C, more preferably
40-80°C, even more preferably 40-70°C, preferably 40 or 45 or 50°C. The enzyme activity is
defined by reference to appropriate blinds, e.g. a buffer blind.

Examples of antimicrobial peptides (AMP's) are CAP18, Leucocin A, Tritrpticin,
Protegrin-1, Thanatin, Defensin, Lactoferrin, Lactoferricin, and Ovispirin such as Novispirin
(Robert Lehrer, 2000), Plectasins, and Statins, including the compounds and polypeptides
disclosed in WO 03/044049 and WO 03/048148, as well as variants or fragments of the above
that retain antimicrobial activity.

Examples of antifungal polypeptides (AFP's) are the Aspergillus giganteus, and
Aspergillus niger peptides, as well as variants and fragments thereof which retain antifungal
activity, as disclosed in WO 94/01459 and WO 02/090384.

Usually fat and water soluble vitamins, as well as trace minerals form part of a so-called
premix intended for addition to the feed, whereas macro minerals are usually separately added
to the feed. A premix enriched with a protease of the invention is an example of an animal
feed additive of the invention.

In a particular embodiment, the animal feed additive of the invention is intended for
being included (or prescribed as having to be included) in animal diets or feed at levels of 0.01
to 10.0%; more particularly 0.05 to 5.0%; or 0.2 to 1.0% (% meaning g additive per 100 g
feed). This is so in particular for premixes.

The nutritional requirements of these components (exemplified with poultry and
piglets/pigs) are listed in Table A of WO 01/58275. Nutritional requirement means that these
components should be provided in the diet in the concentrations indicated.

In the alternative, the animal feed additive of the invention comprises at least one of
the individual components specified in Table A of WO 01/58275. At least one means either of,
one or more of, one, or two, or three, or four and so forth up to all thirteen, or up to all fifteen
individual components. More specifically, this at least one individual component is included in
the additive of the invention in such an amount as to provide an in-feed concentration within
the range indicated in column four, or column five, or column six of Table A.

The present invention also relates to animal feed compositions. Animal feed
compositions or diets have a relatively high content of protein. Poultry and pig diets can be
characterised as indicated in Table B of WO 01/58275, columns 2-3. Fish diets can be
characterised as indicated in column 4 of this Table B. Furthermore such fish diets usually
have a crude fat content of 200-310 g/kg. WO 01/58275 corresponds to US 09/779334 which
is hereby incorporated by reference.

An animal feed composition according to the invention has a crude protein content of
50-800 g/kg, and furthermore comprises at least one protease variant as claimed herein.

Furthermore, or in the alternative (to the crude protein content indicated above), the
animal feed composition of the invention has a content of metabolisable energy of 10-30
MJ/kg; and/or a content of calcium of 0.1-200 g/kg; and/or a content of available phosphorus
of 0.1-200 g/kg; and/or a content of methionine of 0.1-100 g/kg; and/or a content of
methionine plus cysteine of 0.1-150 g/kg; and/or a content of lysine of 0.5-50 g/kg.

In particular embodiments, the content of metabolisable energy, crude protein, calcium,
phosphorus, methionine, methionine plus cysteine, and/or lysine is within any one of ranges 2,
3, 4 or 5 in Table B of WO 01/58275 (R. 2-5).

Crude protein is calculated as nitrogen (N) multiplied by a factor 6.25, i.e. Crude
protein (g/kg) = N (g/kg) x 6.25. The nitrogen content is determined by the Kjeldahl method
(A.O.A.C., 1984, Official Methods of Analysis 14th ed., Association of Official Analytical
Chemists, Washington DC).
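
The crude protein calculation stated above can be illustrated with a minimal sketch. The
function names and the example nitrogen value are hypothetical and chosen for illustration
only; the factor 6.25 and the 50-800 g/kg range are those given in the text.

```python
# Illustrative sketch only: names and the example value are assumptions.
KJELDAHL_FACTOR = 6.25  # conversion factor stated in the text


def crude_protein_g_per_kg(nitrogen_g_per_kg: float) -> float:
    """Crude protein (g/kg) = N (g/kg) x 6.25."""
    if nitrogen_g_per_kg < 0:
        raise ValueError("nitrogen content cannot be negative")
    return nitrogen_g_per_kg * KJELDAHL_FACTOR


def within_crude_protein_range(crude_protein: float,
                               low: float = 50.0,
                               high: float = 800.0) -> bool:
    """Check a value against the 50-800 g/kg crude protein range."""
    return low <= crude_protein <= high


if __name__ == "__main__":
    cp = crude_protein_g_per_kg(32.0)  # hypothetical Kjeldahl result, g N per kg
    print(cp, within_crude_protein_range(cp))
```

A Kjeldahl nitrogen result of 32 g/kg would thus correspond to 200 g/kg crude protein,
which lies inside the stated range.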

Metabolisable energy can be calculated on the basis of the NRC publication Nutrient
requirements in swine, ninth revised edition 1988, subcommittee on swine nutrition, committee
on animal nutrition, board of agriculture, national research council. National Academy Press,
Washington, D.C., pp. 2-6, and the European Table of Energy Values for Poultry Feed-stuffs,
Spelderholt centre for poultry research and extension, 7361 DA Beekbergen, The
Netherlands. Grafisch bedrijf Ponsen & looijen bv, Wageningen. ISBN 90-71463-12-5.

The dietary content of calcium, available phosphorus and amino acids in complete
animal diets is calculated on the basis of feed tables such as Veevoedertabel 1997, gegevens
over chemische samenstelling, verteerbaarheid en voederwaarde van voedermiddelen,
Central Veevoederbureau, Runderweg 8219 pk Lelystad. ISBN 90-72839-13-7.

In a particular embodiment, the animal feed composition of the invention contains at 
least one vegetable protein or protein source as defined above. 

In still further particular embodiments, the animal feed composition of the invention
contains 0-80% maize; and/or 0-80% sorghum; and/or 0-70% wheat; and/or 0-70% barley;
and/or 0-30% oats; and/or 0-40% soybean meal; and/or 0-10% fish meal; and/or 0-20% whey.

Animal diets can e.g. be manufactured as mash feed (non pelleted) or pelleted feed. Typically,
the milled feed-stuffs are mixed and sufficient amounts of essential vitamins and minerals are
added according to the specifications for the species in question. Enzymes can be added as
solid or liquid enzyme formulations. For example, a solid enzyme formulation is typically added
before or during the mixing step; and a liquid enzyme preparation is typically added after the
pelleting step. The enzyme may also be incorporated in a feed additive or premix.

The final enzyme concentration in the diet is within the range of 0.01-200 mg enzyme
protein per kg diet, for example in the range of 0.5-25 mg enzyme protein per kg animal diet.

The protease variant should of course be applied in an effective amount, i.e. in an
amount adequate for improving solubilization and/or improving nutritional value of feed. It is at
present contemplated that the enzyme is administered in one or more of the following amounts
(dosage ranges): 0.01-200; 0.01-100; 0.5-100; 1-50; 5-100; 10-100; 0.05-50; or 0.10-10 - all
these ranges being in mg protease enzyme protein per kg feed (ppm).

For determining mg enzyme protein per kg feed, the protease is purified from the feed
composition, and the specific activity of the purified protease is determined using a relevant
assay (see under protease activity, substrates, and assays). The protease activity of the feed
composition as such is also determined using the same assay, and on the basis of these two
determinations, the dosage in mg enzyme protein per kg feed is calculated.
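
The calculation from the two determinations can be sketched as follows. This is a minimal
illustration under stated assumptions: the activity unit is arbitrary but must be the same in
both determinations, and the function name and the example values are hypothetical.

```python
# Illustrative sketch only: names, units and example values are assumptions.

def enzyme_protein_mg_per_kg(feed_activity_units_per_kg: float,
                             specific_activity_units_per_mg: float) -> float:
    """Dosage in mg enzyme protein per kg feed.

    feed_activity_units_per_kg: protease activity measured in the feed
        composition as such (activity units per kg feed).
    specific_activity_units_per_mg: specific activity of the purified
        protease (same activity units per mg enzyme protein).
    """
    if specific_activity_units_per_mg <= 0:
        raise ValueError("specific activity must be positive")
    if feed_activity_units_per_kg < 0:
        raise ValueError("feed activity cannot be negative")
    return feed_activity_units_per_kg / specific_activity_units_per_mg


if __name__ == "__main__":
    # Hypothetical values: 1500 units per kg feed, 100 units per mg enzyme protein.
    dose = enzyme_protein_mg_per_kg(1500.0, 100.0)
    print(f"{dose} mg enzyme protein per kg feed")
```

Both activities must be measured with the same assay under the same conditions, as the
text requires; otherwise the ratio has no meaning.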

The same principles apply for determining mg enzyme protein in feed additives. Of
course, if a sample is available of the protease used for preparing the feed additive or the
feed, the specific activity is determined from this sample (no need to purify the protease from
the feed composition or the additive).

Detergent Compositions 

The protease variant of the invention may be added to and thus become a component
of a detergent composition.

The detergent composition of the invention may for example be formulated as a hand
or machine laundry detergent composition including a laundry additive composition suitable for
pre-treatment of stained fabrics and a rinse added fabric softener composition, or be
formulated as a detergent composition for use in general household hard surface cleaning
operations, or be formulated for hand or machine dishwashing operations.

In a specific aspect, the invention provides a detergent additive comprising the
protease variant of the invention. The detergent additive as well as the detergent composition
may comprise one or more other enzymes such as another protease, such as alkaline
proteases from Bacillus, a lipase, a cutinase, an amylase, a carbohydrase, a cellulase, a
pectinase, a mannanase, an arabinase, a galactanase, a xylanase, an oxidase, e.g., a
laccase, and/or a peroxidase.

In general the properties of the chosen enzyme(s) should be compatible with the
selected detergent (i.e. pH-optimum, compatibility with other enzymatic and non-enzymatic
ingredients, etc.), and the enzyme(s) should be present in effective amounts.

Suitable lipases include those of bacterial or fungal origin. Chemically modified or
protein engineered mutants are included. Examples of useful lipases include lipases from
Humicola (synonym Thermomyces), e.g. from H. lanuginosa (T. lanuginosus) as described in
EP 258068 and EP 305216 or from H. insolens as described in WO 96/13580, a
Pseudomonas lipase, e.g. from P. alcaligenes or P. pseudoalcaligenes (EP 218272), P.
cepacia (EP 331376), P. stutzeri (GB 1,372,034), P. fluorescens, Pseudomonas sp. strain SD
705 (WO 95/06720 and WO 96/27002), P. wisconsinensis (WO 96/12012), a Bacillus lipase,
e.g. from B. subtilis (Dartois et al. (1993), Biochemica et Biophysica Acta, 1131, 253-360), B.
stearothermophilus (JP 64/744992) or B. pumilus (WO 91/16422). Other examples are lipase
variants such as those described in WO 92/05249, WO 94/01541, EP 407225, EP 260105,
WO 95/35381, WO 96/00292, WO 95/30744, WO 94/25578, WO 95/14783, WO 95/22615,
WO 97/04079 and WO 97/07202. Preferred commercially available lipase enzymes include
Lipolase™ and Lipolase Ultra™ (Novozymes A/S).

Suitable amylases (alpha- and/or beta-) include those of bacterial or fungal origin.
Chemically modified or protein engineered mutants are included. Amylases include, for
example, alpha-amylases obtained from Bacillus, e.g. a special strain of B. licheniformis,
described in more detail in GB 1,296,839. Examples of useful amylases are the variants
described in WO 94/02597, WO 94/18314, WO 95/26397, WO 96/23873, WO 97/43424, WO
00/60060, and WO 01/66712, especially the variants with substitutions in one or more of the
following positions: 15, 23, 105, 106, 124, 128, 133, 154, 156, 181, 188, 190, 197, 202, 208,
209, 243, 264, 304, 305, 391, 408, and 444. Commercially available amylases are Natalase™,
Supramyl™, Stainzyme™, Duramyl™, Termamyl™, Fungamyl™ and BAN™ (Novozymes A/S),
Rapidase™ and Purastar™ (from Genencor International Inc.).

Suitable cellulases include those of bacterial or fungal origin. Chemically modified or
protein engineered mutants are included. Suitable cellulases include cellulases from the
genera Bacillus, Pseudomonas, Humicola, Fusarium, Thielavia, Acremonium, e.g. the fungal
cellulases produced from Humicola insolens, Myceliophthora thermophila and Fusarium
oxysporum disclosed in US 4,435,307, US 5,648,263, US 5,691,178, US 5,776,757 and WO
89/09259. Especially suitable cellulases are the alkaline or neutral cellulases having colour
care benefits. Examples of such cellulases are cellulases described in EP 0 495257, EP
531372, WO 96/11262, WO 96/29397, WO 98/08940. Other examples are cellulase variants
such as those described in WO 94/07998, EP 0 531 315, US 5,457,046, US 5,686,593, US
5,763,254, WO 95/24471, WO 98/12307 and WO 99/01544. Commercially available
cellulases include Celluzyme™, and Carezyme™ (Novozymes A/S), Clazinase™, and
Puradax HA™ (Genencor International Inc.), and KAC-500(B)™ (Kao Corporation).

Suitable peroxidases/oxidases include those of plant, bacterial or fungal origin.
Chemically modified or protein engineered mutants are included. Examples of useful
peroxidases include peroxidases from Coprinus, e.g. from C. cinereus, and variants thereof as
those described in WO 93/24618, WO 95/10602, and WO 98/15257. Commercially available
peroxidases include Guardzyme™ (Novozymes).

The detergent enzyme(s) may be included in a detergent composition by adding
separate additives containing one or more enzymes, or by adding a combined additive
comprising all of these enzymes. A detergent additive of the invention, i.e. a separate additive
or a combined additive, can be formulated e.g. as a granulate, a liquid, a slurry, etc. Preferred
detergent additive formulations are granulates, in particular non-dusting granulates, liquids, in
particular stabilized liquids, or slurries.

Non-dusting granulates may be produced, e.g., as disclosed in US 4,106,991 and
4,661,452 and may optionally be coated by methods known in the art. Examples of waxy
coating materials are poly(ethylene oxide) products (polyethyleneglycol, PEG) with mean
molar weights of 1000 to 20000; ethoxylated nonylphenols having from 16 to 50 ethylene
oxide units; ethoxylated fatty alcohols in which the alcohol contains from 12 to 20 carbon
atoms and in which there are 15 to 80 ethylene oxide units; fatty alcohols; fatty acids; and
mono- and di- and triglycerides of fatty acids. Examples of film-forming coating materials
suitable for application by fluid bed techniques are given in GB 1483591. Liquid enzyme
preparations may, for instance, be stabilized by adding a polyol such as propylene glycol, a
sugar or sugar alcohol, lactic acid or boric acid according to established methods. Protected
enzymes may be prepared according to the method disclosed in EP 238216.

The detergent composition of the invention may be in any convenient form, e.g., a bar,
a tablet, a powder, a granule, a paste or a liquid. A liquid detergent may be aqueous, typically
containing up to 70% water and 0-30% organic solvent, or non-aqueous.

The detergent composition comprises one or more surfactants, which may be non-ionic 
including semi-polar and/or anionic and/or cationic and/or zwitterionic. The surfactants are 
typically present at a level of from 0.1 % to 60% by weight. 

When included therein the detergent will usually contain from about 1% to about 40%
of an anionic surfactant such as linear alkylbenzenesulfonate, alpha-olefinsulfonate, alkyl
sulfate (fatty alcohol sulfate), alcohol ethoxysulfate, secondary alkanesulfonate, alpha-sulfo
fatty acid methyl ester, alkyl- or alkenylsuccinic acid or soap.

When included therein the detergent will usually contain from about 0.2% to about 40%
of a non-ionic surfactant such as alcohol ethoxylate, nonylphenol ethoxylate,
alkylpolyglycoside, alkyldimethylamineoxide, ethoxylated fatty acid monoethanolamide, fatty
acid monoethanolamide, polyhydroxy alkyl fatty acid amide, or N-acyl N-alkyl derivatives of
glucosamine ("glucamides").

The detergent may contain 0-65% of a detergent builder or complexing agent such as
zeolite, diphosphate, triphosphate, phosphonate, carbonate, citrate, nitrilotriacetic acid,
ethylenediaminetetraacetic acid, diethylenetriaminepentaacetic acid, alkyl- or alkenylsuccinic
acid, soluble silicates or layered silicates (e.g. SKS-6 from Hoechst).

The detergent may comprise one or more polymers. Examples are
carboxymethylcellulose, poly(vinylpyrrolidone), poly(ethylene glycol), poly(vinyl alcohol),
poly(vinylpyridine-N-oxide), poly(vinylimidazole), polycarboxylates such as polyacrylates,
maleic/acrylic acid copolymers and lauryl methacrylate/acrylic acid copolymers.

The detergent may contain a bleaching system which may comprise a H2O2 source
such as perborate or percarbonate which may be combined with a peracid-forming bleach
activator such as tetraacetylethylenediamine or nonanoyloxybenzenesulfonate. Alternatively,
the bleaching system may comprise peroxyacids of e.g. the amide, imide, or sulfone type.

The enzyme(s) of the detergent composition of the invention may be stabilized using
conventional stabilizing agents, e.g., a polyol such as propylene glycol or glycerol, a sugar or
sugar alcohol, lactic acid, boric acid, or a boric acid derivative, e.g., an aromatic borate ester,
or a phenyl boronic acid derivative such as 4-formylphenyl boronic acid, and the composition
may be formulated as described in e.g. WO 92/19709 and WO 92/19708.

The detergent may also contain other conventional detergent ingredients such as e.g.
fabric conditioners including clays, foam boosters, suds suppressors, anti-corrosion agents,
soil-suspending agents, anti-soil redeposition agents, dyes, bactericides, optical brighteners,
hydrotropes, tarnish inhibitors, or perfumes.

It is at present contemplated that in the detergent compositions any enzyme, in
particular the enzyme of the invention, may be added in an amount corresponding to 0.01-100
mg of enzyme protein per liter of wash liquor, preferably 0.05-5 mg of enzyme protein per liter
of wash liquor, in particular 0.1-1 mg of enzyme protein per liter of wash liquor.

The enzyme of the invention may additionally be incorporated in the detergent
formulations disclosed in WO 97/07202.

Method for Generating Protease Variants

The invention also relates to a method for generating a protease variant of an
improved property, the method comprising the following steps:

(a) selecting a parent protease of at least 60% identity to SEQ ID NO: 2;

(b) establishing a 3D structure of the parent protease by homology modelling using
the Fig. 2 structure as a model; and/or aligning the parent protease according to the alignment
of Fig. 1;

(c) proposing at least one amino acid substitution, e.g. by:

(i) subjecting the 3D structure of (b) to MD simulations at increased
temperatures, and identifying regions in the amino acid sequence of the parent
protease of high mobility (isotropic fluctuations);

(ii) introducing disulfide bridges by way of cysteine substitutions (C-C);

(iii) introducing proline substitutions (P);

(iv) replacing exposed neutral amino acid residues with negatively charged
amino acid residues (E,D);

(v) replacing exposed neutral amino acid residues with positively charged
amino acid residues (R,K);

(vi) replacing small amino acid residues inside the protein with bulkier amino
acid residues (W);

(vii) comparing by homology alignment and/or homology modelling
according to step (c)(i) at least two related parent proteases and transferring amino
acid residue differences in between these protease backbones, preferably from a
backbone having the improved property to a backbone not having this improved
property;

(d) preparing a DNA sequence encoding the parent protease but for inclusion of a
DNA codon of the at least one amino acid substitution proposed in steps (c)(ii)-(c)(vii), or
subjecting the parent DNA sequence to random mutagenesis, targeting at least one of the
regions identified in step (c)(i);

(e) expressing the DNA sequence obtained in step (d) in a host cell, and

(f) selecting a host cell expressing a protease variant with an improved property.

The invention furthermore relates to a method for producing a protease variant
obtainable or obtained by the method of generating protease variants described above,
comprising (a) cultivating the host cell to produce a supernatant comprising the variant; and
(b) recovering the variant.

Alternative Embodiment

In an alternative embodiment, the term "alteration" is used instead of "substitution" as
the general term for amendments in the protease molecule. This alternative embodiment
includes each of the claims formulated as exemplified below for claim 1, and also specifically
includes everything that is stated herein, e.g. definitions (other than the definition of
substitution), i.e. the various aspects, particular embodiments etc.

A variant of a parent protease, comprising an alteration in at least one position of at
least one region selected from the group of regions consisting of:
6-18; 22-28; 32-39; 42-58; 62-63; 66-76; 78-100; 103-106; 111-114; 118-131; 134-136;
139-141; 144-151; 155-156; 160-176; 179-181; and 184-188; wherein

(a) the alteration(s) are independently

(i) an insertion of an amino acid immediately downstream of the position,

(ii) a deletion of the amino acid which occupies the position, and/or

(iii) a substitution of the amino acid which occupies the position;

(b) the variant has protease activity; and

(c) each position corresponds to a position of SEQ ID NO: 2; and

(d) the variant has a percentage of identity to SEQ ID NO: 2 of at least 60%.
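
Whether a given position falls within one of the recited regions can be checked with a simple
lookup. The following is a minimal sketch, not part of the claimed subject matter: the function
names are hypothetical, and positions are assumed to be already expressed in SEQ ID NO: 2
numbering (i.e. the alignment of the variant to SEQ ID NO: 2 has been done beforehand).

```python
# Illustrative sketch only: names are assumptions; positions are assumed to be
# given in SEQ ID NO: 2 numbering after alignment.
from typing import Iterable, Optional, Tuple

# Regions as recited in the text (inclusive start and end positions).
REGIONS = [
    (6, 18), (22, 28), (32, 39), (42, 58), (62, 63), (66, 76), (78, 100),
    (103, 106), (111, 114), (118, 131), (134, 136), (139, 141), (144, 151),
    (155, 156), (160, 176), (179, 181), (184, 188),
]


def region_of(position: int) -> Optional[Tuple[int, int]]:
    """Return the recited region containing the position, or None."""
    for start, end in REGIONS:
        if start <= position <= end:
            return (start, end)
    return None


def has_alteration_in_region(altered_positions: Iterable[int]) -> bool:
    """True if at least one altered position lies in at least one region."""
    return any(region_of(p) is not None for p in altered_positions)


if __name__ == "__main__":
    for pos in (10, 20, 101, 188):
        print(pos, region_of(pos))
```

The check covers only the positional feature; protease activity and the percentage of identity
to SEQ ID NO: 2 have to be established separately.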

The term "polypeptide variant", "protein variant", "enzyme variant", "protease variant" or
simply "variant" refers to a polypeptide of the invention comprising one or more alteration(s),
such as substitution(s), insertion(s), deletion(s), and/or truncation(s) of one or more specific
amino acid residue(s) in one or more specific position(s) in the polypeptide.

The term "parent polypeptide", "parent protein", "parent enzyme", "standard enzyme",
"parent protease" or simply "parent" refers to the polypeptide on which the variant was based.
This term also refers to the polypeptide with which a variant is compared and aligned.

The term "randomized library", "variant library", or simply "library" refers to a library of
variant polypeptides. Diversity in the variant library can be generated via mutagenesis of the
genes encoding the variants at the DNA triplet level, such that individual codons are
variegated e.g. by using primers of partially randomized sequence in a PCR reaction. Several
techniques have been described, by which one can create a diverse combinatorial library by
variegating several nucleotide positions in a gene and recombining them, for instance where
these positions are too far apart to be covered by a single (spiked or doped) oligonucleotide
primer. These techniques include the use of in vivo recombination of the individually diversified
gene segments as described in WO 97/07205 on page 3, lines 8 to 29 (Novozymes A/S). They
also include the use of DNA shuffling techniques to create a library of full length genes,
wherein several gene segments are combined, and wherein each segment may be diversified
e.g. by spiked mutagenesis (Stemmer, Nature 370, pp. 389-391, 1994 and US 5,811,238; US
5,605,793; and US 5,830,721). One can use a gene encoding a protein "backbone" (wildtype
parent polypeptide) as a template polynucleotide, and combine this with one or more single or
double-stranded oligonucleotides as described in WO 98/41623 and in WO 98/41622
(Novozymes A/S). The single-stranded oligonucleotides could be partially randomized during
synthesis. The double-stranded oligonucleotides could be PCR products incorporating
diversity in a specific region. In both cases, one can dilute the diversity with corresponding
segments encoding the sequence of the backbone protein in order to limit the average number
of changes that are introduced.

Methods have also been established for designing the ratios of nucleotide mixtures (A;
C; T; G) to be inserted in specific codon positions during oligo- or polynucleotide synthesis, so
as to introduce a bias in order to approximate a desired frequency distribution towards a set of
one or more desired amino acids that will be encoded by the particular codons. It may be of
interest to produce a variant library that comprises permutations of a number of known amino
acid modifications in different locations in the primary sequence of the polypeptide. These
could be introduced post-translationally or by chemical modification sites, or they could be
introduced through mutations in the encoding genes. The modifications by themselves may
previously have been proven beneficial for one reason or another (e.g. decreasing antigenicity,
or improving specific activity, performance, stability, or other characteristics). In such
instances, it may be desirable first to create a library of diverse combinations of known
sequences. For example, if twelve individual mutations are known, one could combine (at
least) twelve segments of the parent protein encoding gene, wherein each segment is present
in two forms: one with, and one without the desired mutation. By varying the relative amounts
of those segments, one could design a library (of size 2^12) for which the average number of
mutations per gene can be predicted. This can be a useful way of combining mutations that
by themselves give some, but not sufficient effect, without resorting to very large libraries, as
is often the case when using 'spiked mutagenesis'. Another way to combine these 'known
mutations' could be by using family shuffling of oligomeric DNA encoding the known mutations
with fragments of the full length wild type sequence.
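
The arithmetic behind the twelve-segment example can be sketched in Python as follows. This is only an illustration under stated assumptions: each segment is taken to be drawn independently, the mutated form of every segment is assumed to be supplied at the same fraction (`mutant_fraction`), and the function names and the 25% figure are hypothetical rather than taken from the text.

```python
from math import comb


def library_size(n_segments: int) -> int:
    """Number of distinct genes when every segment exists in two forms."""
    return 2 ** n_segments


def mean_mutations(n_segments: int, mutant_fraction: float) -> float:
    """Expected mutations per gene (assumes independent, equally mixed segments)."""
    return n_segments * mutant_fraction


def mutation_count_distribution(n_segments: int, mutant_fraction: float) -> list:
    """Binomial probabilities P(k mutations per gene) for k = 0..n_segments."""
    p = mutant_fraction
    return [
        comb(n_segments, k) * p ** k * (1 - p) ** (n_segments - k)
        for k in range(n_segments + 1)
    ]


print(library_size(12))          # 4096
print(mean_mutations(12, 0.25))  # 3.0
```

Lowering `mutant_fraction` shifts the distribution towards genes carrying few mutations, which is the sense in which the relative amounts of the two forms set the average number of mutations per gene.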

In describing the various variants produced or contemplated according to the invention,
a number of nomenclatures and conventions are used which are described in detail below. A
frame of reference is first defined by aligning the variant polypeptide with a parent enzyme. A
preferred parent enzyme is Protease 10 (SEQ ID NO: 2). Thereby a number of alterations will
be defined in relation to the amino acid sequence of SEQ ID NO: 2.
A substitution in a variant is indicated as:
Original amino acid - position - substituted amino acid;

The three or one letter codes are used, including the codes Xaa and X to indicate any
amino acid residue. Accordingly, the notation "T82S" or "Thr82Ser" means that the variant
comprises a substitution of threonine with serine in the variant amino acid position
corresponding to the amino acid in position 82 in the parent enzyme, when the two are aligned
as indicated above.

Where the original amino acid residue may be any amino acid residue, a shorthand
notation may at times be used indicating only the position and the substituted amino acid, for
example:

Position - substituted amino acid; or "82S".

Such a notation is particularly relevant in connection with modification(s) in a series of
homologous polypeptides.
Similarly when the identity of the substituting amino acid residue(s) is immaterial:

Original amino acid - position; or "T82".

When both the original amino acid(s) and substituted amino acid(s) may be any amino
acid, then only the position is indicated, e.g.: "82".
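
The single-substitution forms described so far ("T82S", "Thr82Ser", "82S", "T82" and "82") can be read mechanically. The Python sketch below is a minimal illustration, not part of the nomenclature itself: the function name `parse_substitution` is hypothetical, an omitted residue is reported as "X" (any amino acid), and the one-letter branch accepts any capital letter without checking that it is a valid amino acid code.

```python
import re

# Standard three-letter to one-letter amino acid codes, plus Xaa for "any".
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xaa": "X",
}

# Three-letter codes are tried first, then a single capital letter.
_AA = "|".join(list(THREE_TO_ONE) + ["[A-Z]"])
_PATTERN = re.compile(rf"^({_AA})?(\d+)({_AA})?$")


def _to_one_letter(code):
    """Map a three- or one-letter code to one letter; None means any residue."""
    if code is None:
        return "X"
    return THREE_TO_ONE.get(code, code)


def parse_substitution(notation):
    """Return (original, position, substituted) for a single substitution."""
    match = _PATTERN.match(notation)
    if match is None:
        raise ValueError(f"not a single-substitution notation: {notation!r}")
    original, position, substituted = match.groups()
    return _to_one_letter(original), int(position), _to_one_letter(substituted)


print(parse_substitution("Thr82Ser"))  # ('T', 82, 'S')
print(parse_substitution("82S"))       # ('X', 82, 'S')
```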

When the original amino acid(s) and/or substituted amino acid(s) may comprise more
than one, but not all amino acid(s), then the amino acids are listed separated by commas:

Original amino acids - position no. - substituted amino acids; or "T10E,D,Y".
A number of examples of this nomenclature are listed below:
The substitution of threonine for histidine in position 91 is designated as: "His91Thr" or
"H91T"; or the substitution of any amino acid residue for histidine in position 91 is
designated as: "His91Xaa" or "H91X" or "His91" or "H91".

For a modification where the original amino acid(s) and/or substituted amino acid(s)
may comprise more than one, but not all amino acid(s), the substitution of glutamic acid,
aspartic acid, or tyrosine for threonine in position 10:

"Thr10Glu,Asp,Tyr" or "T10E,D,Y"; which indicates the specific variants: "T10E",
"T10D", and "T10Y".

A deletion of glycine in position 26 will be indicated by: "Gly26*" or "G26*".
Correspondingly, the deletion of more than one amino acid residue, such as the
deletion of glycine and glutamine in positions 26 and 27, will be designated "Gly26*+Gln27*" or
"G26*+Q27*".

The insertion of an additional amino acid residue such as e.g. a lysine after G26 is
indicated by: "Gly26GlyLys" or "G26GK"; or, when more than one amino acid residue is
inserted, such as e.g. a Lys and Ala after G26, this will be indicated as: "Gly26GlyLysAla" or
"G26GKA".

In such cases the inserted amino acid residue(s) are numbered by the addition of lower
case letters to the position number of the amino acid residue preceding the inserted amino
acid residue(s). In the above example the sequences would thus be:

Parent:     Variant:
26          26  26a  26b
G           G   K    A
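
The lower-case lettering of inserted residues can be sketched in Python as below. The helper name `label_insertion` is hypothetical, and the sketch assumes at most 26 inserted residues so that a single letter suffices.

```python
def label_insertion(position, original, inserted):
    """Return (label, residue) pairs for the residue at `position` followed
    by the residues inserted after it, lettered a, b, c, ..."""
    if len(inserted) > 26:
        raise ValueError("this sketch only letters up to 26 inserted residues")
    labels = [(str(position), original)]
    for offset, residue in enumerate(inserted):
        labels.append((f"{position}{chr(ord('a') + offset)}", residue))
    return labels


# "G26GKA": Lys and Ala inserted after the glycine in position 26.
print(label_insertion(26, "G", "KA"))
# [('26', 'G'), ('26a', 'K'), ('26b', 'A')]
```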

In cases where an amino acid residue identical to the existing amino acid residue is
inserted, it is clear that degeneracy in the nomenclature arises. If for example a glycine is
inserted after the glycine in the above example this would be indicated by "G26GG".
Given that an alanine were present in position 25, the same actual change could just
as well be indicated as "A25AG":

                Parent:     Variant:
Numbering I:    25  26      25  26   26a
Sequence:       A   G       A   G    G
Numbering II:               25  25a  26

Such instances will be apparent to the skilled person, and the indication "G26GG" and
corresponding indications for this type of insertions are thus meant to comprise such equivalent
degenerate indications.
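
That "G26GG" and "A25AG" describe the same sequence can be checked with a few lines of Python. The four-residue fragment and its numbering (24 to 27) are hypothetical and chosen only so that an alanine sits in position 25 and a glycine in position 26; `apply_insertion` is likewise an illustrative helper.

```python
def apply_insertion(parent, position, inserted, first_position=1):
    """Insert residues directly after `position`; numbering of `parent`
    starts at `first_position`."""
    index = position - first_position + 1  # slice index just past that residue
    return parent[:index] + inserted + parent[index:]


# Hypothetical fragment covering positions 24..27 (A at 25, G at 26).
parent = "SAGQ"
g26gg = apply_insertion(parent, 26, "G", first_position=24)
a25ag = apply_insertion(parent, 25, "G", first_position=24)
print(g26gg, a25ag, g26gg == a25ag)  # SAGGQ SAGGQ True
```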

By analogy, if amino acid sequence segments are repeated in the parent polypeptide
and/or in the variant, it will be apparent to the skilled person that equivalent degenerate
indications are comprised, also when other alterations than insertions are listed, such as
deletions and/or substitutions. For instance, the deletion of two consecutive amino acids "AG"
in the sequence "AGAG" from position 194-197 may be written as "A194*+G195*" or
"A196*+G197*":

                Parent:               Variant:
Numbering I:    194  195  196  197    194  195
Sequence:       A    G    A    G      A    G
Numbering II:                         196  197

Variants comprising multiple modifications are separated by pluses, e.g.:
"Arg170Tyr+Gly195Glu" or "R170Y+G195E", representing modifications in positions 170 and
195 substituting tyrosine and glutamic acid for arginine and glycine, respectively. Thus,
"Tyr167Gly,Ala,Ser,Thr+Arg170Gly,Ala,Ser,Thr" designates the following variants:
"Tyr167Gly+Arg170Gly", "Tyr167Gly+Arg170Ala", "Tyr167Gly+Arg170Ser",
"Tyr167Gly+Arg170Thr", "Tyr167Ala+Arg170Gly", "Tyr167Ala+Arg170Ala",
"Tyr167Ala+Arg170Ser", "Tyr167Ala+Arg170Thr", "Tyr167Ser+Arg170Gly",
"Tyr167Ser+Arg170Ala", "Tyr167Ser+Arg170Ser", "Tyr167Ser+Arg170Thr",
"Tyr167Thr+Arg170Gly", "Tyr167Thr+Arg170Ala", "Tyr167Thr+Arg170Ser", and
"Tyr167Thr+Arg170Thr".

This nomenclature is particularly relevant relating to modifications aimed at substituting,
inserting or deleting amino acid residues having specific common properties; such
modifications are referred to as conservative amino acid modification(s).
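
The way a comma-and-plus notation such as "Tyr167Gly,Ala,Ser,Thr+Arg170Gly,Ala,Ser,Thr" unfolds into individual variants can be sketched in Python. The function name `expand_variants` is hypothetical, and the sketch performs no validation of the amino acid codes; it simply forms every combination of one alternative per position.

```python
import re
from itertools import product


def expand_variants(notation):
    """Expand e.g. 'T10E,D,Y' or 'A1B,C+D2E,F' into the individual variants."""
    choices_per_site = []
    for site in notation.split("+"):
        # Prefix = optional original residue + position; rest = alternatives.
        match = re.fullmatch(r"([A-Za-z]*\d+)(.*)", site.strip())
        if match is None:
            raise ValueError(f"cannot parse modification: {site!r}")
        prefix, alternatives = match.groups()
        choices_per_site.append(
            [prefix + alt.strip() for alt in alternatives.split(",")]
        )
    return ["+".join(combo) for combo in product(*choices_per_site)]


variants = expand_variants("Tyr167Gly,Ala,Ser,Thr+Arg170Gly,Ala,Ser,Thr")
print(len(variants))   # 16
print(variants[0])     # Tyr167Gly+Arg170Gly
print(variants[-1])    # Tyr167Thr+Arg170Thr
```

The variants come out in the same order as the sixteen listed above, since the alternatives at the last position vary fastest.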


The present invention is further described by the following examples which should not 
be construed as limiting the scope of the invention. 

Examples

Example 1: Protease assays
pNA assay

pNA substrate : Suc-AAPF-pNA (Bachem L-1400).
Temperature : Room temperature (25°C)
Assay buffers : 100mM succinic acid, 100mM HEPES, 100mM CHES, 100mM CABS,
1mM CaCl2, 150mM KCl, 0.01% Triton X-100 adjusted to pH-values 2.0, 2.5, 3.0, 3.5, 4.0, 5.0,
6.0, 7.0, 8.0, 9.0, 10.0, 11.0, and 12.0 with HCl or NaOH.

20µl protease (diluted in 0.01% Triton X-100) is mixed with 100µl assay buffer. The
assay is started by adding 100µl pNA substrate (50mg dissolved in 1.0ml DMSO and further
diluted 45x with 0.01% Triton X-100). The increase in OD405 is monitored as a measure of the
protease activity.
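
For orientation, the substrate concentrations implied by this recipe can be worked out with a short Python sketch. It only restates the volumes and dilution given above; the function name is hypothetical, and the result is a mass concentration because no molar mass is stated in the text.

```python
def substrate_concentrations(mass_mg=50.0, dmso_ml=1.0, dilution=45.0,
                             protease_ul=20.0, buffer_ul=100.0,
                             substrate_ul=100.0):
    """Return (stock, working, in_well) substrate concentrations in mg/ml."""
    stock = mass_mg / dmso_ml                      # dissolved in DMSO
    working = stock / dilution                     # after the 45x dilution
    total_ul = protease_ul + buffer_ul + substrate_ul
    in_well = working * substrate_ul / total_ul    # after mixing in the assay
    return stock, working, in_well


stock, working, in_well = substrate_concentrations()
print(f"{stock:.1f} mg/ml stock, {working:.2f} mg/ml working, "
      f"{in_well:.2f} mg/ml in the assay mixture")
```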


From these results it appears that Protease 22 has a higher temperature optimum at
pH 9 than the 10R protease, viz. around 80°C as compared to around 70°C.

Differential Scanning Calorimetry (DSC) was used to determine temperature stability at
pH 7.0 of Protease 22 and Protease 10. The purified proteases were dialysed overnight at
4°C against 10 mM sodium phosphate, 50 mM sodium chloride, pH 7.0 and run on a VP-DSC
instrument (MicroCal) with a constant scan rate of 1.5°C/min from 20 to 100°C. Data-handling
was performed using the MicroCal Origin software.

The resulting denaturation or melting temperatures, Tm's, were: for Protease 22:
83.5°C; for Protease 10: 76.5°C.

The invention described and claimed herein is not to be limited in scope by the specific
embodiments herein disclosed, since these embodiments are intended as illustrations of
several aspects of the invention. Any equivalent embodiments are intended to be within the
scope of this invention. Indeed, various modifications of the invention in addition to those
shown and described herein will become apparent to those skilled in the art from the foregoing
description. Such modifications are also intended to fall within the scope of the appended
claims. In the case of conflict, the present disclosure including definitions will control.

Various references are cited herein, the disclosures of which are incorporated by
reference in their entireties.
Claims

1. A variant of a parent protease, comprising a substitution in at least one position of at
least one region selected from the group of regions consisting of:
6-18; 22-28; 32-39; 42-58; 62-63; 66-76; 78-100; 103-106; 111-114; 118-131; 134-136;
139-141; 144-151; 155-156; 160-176; 179-181; and 184-188; wherein
(a) the variant has protease activity; and
(b) each position corresponds to a position of SEQ ID NO: 2; and
(c) the variant has a percentage of identity to SEQ ID NO: 2 of at least 60%.
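
Whether a given position (numbered against SEQ ID NO: 2) falls within one of the regions recited in claim 1 can be checked mechanically. The Python sketch below is illustrative only: the names `REGIONS` and `in_claimed_region` are hypothetical, and the sketch does not address the activity or identity requirements of items (a) and (c).

```python
# Regions of claim 1 as inclusive (first, last) position pairs.
REGIONS = [
    (6, 18), (22, 28), (32, 39), (42, 58), (62, 63), (66, 76),
    (78, 100), (103, 106), (111, 114), (118, 131), (134, 136),
    (139, 141), (144, 151), (155, 156), (160, 176), (179, 181),
    (184, 188),
]


def in_claimed_region(position):
    """True if the position lies inside any of the recited regions."""
    return any(first <= position <= last for first, last in REGIONS)


# Enumerating the regions gives the individual positions they cover.
positions = [p for first, last in REGIONS for p in range(first, last + 1)]
print(len(positions))                                  # 144
print(in_claimed_region(82), in_claimed_region(101))   # True False
```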


2. The variant of claim 1 which comprises a substitution in at least one of the following
positions: 6; 7; 8; 9; 10; 11; 12; 13; 14; 15; 16; 17; 18; 22; 23; 24; 25; 26; 27; 28; 32; 33; 34;
35; 36; 37; 38; 39; 42; 43; 44; 45; 46; 47; 48; 49; 50; 51; 52; 53; 54; 55; 56; 57; 58; 62; 63; 66;
67; 68; 69; 70; 71; 72; 73; 74; 75; 76; 78; 79; 80; 81; 82; 83; 84; 85; 86; 87; 88; 89; 90; 91; 92;
93; 94; 95; 96; 97; 98; 99; 100; 103; 104; 105; 106; 111; 112; 113; 114; 118; 119; 120; 121;
122; 123; 124; 125; 126; 127; 128; 129; 130; 131; 134; 135; 136; 139; 140; 141; 144; 145;
146; 147; 148; 149; 150; 151; 155; 156; 160; 161; 162; 163; 164; 165; 166; 167; 168; 169;
170; 171; 172; 173; 174; 175; 176; 179; 180; 181; 184; 185; 186; 187; and/or 188.

3. The variant of claim 2 which comprises a substitution in at least one of the following
positions: 6; 7; 8; 9; 10; 12; 13; 16; 17; 18; 22; 23; 24; 25; 26; 27; 28; 32; 33; 37; 38; 39; 42;
43; 44; 45; 46; 47; 48; 49; 50; 51; 52; 53; 54; 55; 56; 58; 62; 63; 66; 67; 68; 69; 70; 71; 72; 73;
74; 75; 76; 78; 79; 80; 81; 82; 83; 84; 85; 86; 87; 88; 89; 90; 91; 92; 93; 94; 95; 96; 97; 98; 99;
100; 103; 105; 106; 111; 113; 114; 118; 120; 122; 124; 125; 127; 129; 130; 131; 134; 135;
136; 139; 140; 141; 144; 145; 146; 147; 148; 149; 150; 151; 155; 156; 160; 161; 162; 163;
164; 165; 166; 167; 168; 169; 170; 171; 172; 173; 174; 175; 176; 179; 180; 181; 184; 185;
186; 187; and/or 188.

4. The variant of claim 3, which comprises at least one of the following substitutions: 6C;
7P; 8C; 9C; 10E,D; 12E,D; 13E,D,P; 16C; 17C; 18C;
22A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y; 23A,C,D,E,F,G,H,I,K,L,M,P,Q,R,S,T,V,W,Y;
24C,D,E,F,G,H,I,K,L,M,N,P,Q,R,T,V,W,Y; 25C,D,E,F,G,H,I,K,L,M,N,P,Q,R,T,V,W,Y;
26A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; 27A,C,D,E,F,G,H,I,K,L,M,N,P,R,S,T,V,W,Y;
28A,C,D,E,F,G,H,I,K,L,M,N,Q,R,S,T,V,W,Y; 32C; 33C; 37C; 39R,K; 42E,D;
43A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,W,Y; 44A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,V,W,Y;
45A,C,D,E,F,G,H,K,L,M,N,P,Q,R,S,T,V,W,Y; 46A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
47A,C,D,E,F,G,H,I,K,L,M,P,Q,R,S,T,V,W,Y; 48A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
49A,C,D,E,F,G,H,I,K,L,M,N,P,S,V,W,Y; 50A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; 52C;
55C; 56R,K; 58E,D; 62C; 63C; 66A,C,D,E,F,G,H,I,K,L,M,N,P,Q,S,T,V,W,Y;
67A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; 68A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y;
69A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,T,V,W,Y; 70A,C,D,E,F,G,H,I,K,L,M,P,Q,R,S,T,V,W,Y;
71A,C,D,E,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; 72A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y;
73A,C,D,E,F,G,H,I,K,M,N,P,Q,R,S,T,V,W,Y; 74A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y;
75A,C,D,E,F,G,H,I,K,L,M,P,Q,R,S,T,V,W,Y; 76C; 78A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,T,V,W,Y;
79A,C,D,E,F,G,H,I,K,L,M,N,P,Q,S,T,V,W,Y; 80A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W;
81A,C,D,E,F,G,H,I,K,L,M,P,Q,R,S,T,V,W,Y; 82A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,V,W,Y;
83A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; 84A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
85A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W; 86C,D,E,F,G,H,I,K,L,M,N,P,R,S,T,V,W,Y;
87A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,V,W,Y; 88A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,W,Y;
89C,D,E,F,G,H,I,K,L,M,N,P,Q,R,V,W,Y; 90A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; 92P,R,K;
93P; 94C,P; 95E,D; 96E,D,P; 97R,K; 98P; 99R,K; 103C; 105C,P; 106C; 111R,K; 113E,D;
118R,K; 120E,D; 122K; 124R,K; 125P; 127R,K; 129E,D; 130E,D; 134C; 135P; 136P; 139C;
140E,D; 141C; 144C; 145C; 146C; 147W; 148C; 149C; 150E,D; 151P,E,D; 155C; 156C;
160A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; 161A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,T,V,W,Y;
162A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; 163A,C,D,E,F,G,H,I,K,L,M,P,Q,R,S,T,V,W,Y;
164A,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; 165A,C,D,E,F,G,H,I,K,L,M,N,P,Q,T,V,W,Y;
166A,C,D,E,G,H,I,K,L,M,N,P,Q,R,S,W,Y; 167A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
168A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; 169A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y;
170A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y; 172C; 173C; 174P; 175P; 176P; 180R,K;
181R,K; 184P; 187P; and/or 188R,K.

5. The variant of any one of claims 1-4 which comprises at least one of the following pairs
of substitutions: 6C+103C; 8C+105C; 76C+85C; 94C+149C; 55C+63C; 16C+145C;
33C+144C; 62C+173C; 106C+141C; 9C+17C; 18C+156C; 32C+144C; 37C+52C; 67C+71C;
134C+170C; 139C+163C; 146C+148C; and/or 155C+172C.

6. The variant of claim 5, which comprises at least one of the following pairs of
substitutions: 6C+103C; 8C+105C; 76C+85C; 94C+149C; 55C+63C; 16C+145C; 33C+144C;
62C+173C; and/or 106C+141C.

7. The variant of any one of claims 1-4 which comprises at least one of the following
substitutions: 81P; 82P; 151P; 176P; 24P; 25P; 92P; 93P; 94P; 96P; 98P; 105P; 136P; 184P;
187P; 174P; 7P; 13P; 23P; 27P; 125P; 135P; and/or 175P.

8. The variant of claim 7, which comprises at least one of the following substitutions: 81P;
82P; 151P; 176P; 24P; 25P; 92P; 93P; 94P; 96P; 98P; 105P; 136P; 184P; and/or 187P.

9. The variant of any one of claims 1-4 which comprises at least one of the following
substitutions: 95E,D; 42E,D; 84E,D; 96E,D; 47E,D; 46E,D; 150E,D; 70E,D; 13E,D; 140E,D;
10E,D; 151E,D; 129E,D; 130E,D; 166E,D; 161E,D; 120E,D; 82E,D; 58E,D; 12E,D; 81E,D;
69E,D; 113E,D; 89E,D; and/or 160E,D.

10. The variant of claim 9 which comprises at least one of the following substitutions:
95E,D; 42E,D; 84E,D; 96E,D; 47E,D; 46E,D; 150E,D; 70E,D; 13E,D; and/or 140E,D.

11. The variant of any one of claims 1-4 which comprises at least one of the following
substitutions: 124R,K; 72R,K; 97R,K; 127R,K; 56R,K; 122R,K; 181R,K; 180R,K; 25R,K;
92R,K; 39R,K; 99R,K; 111R,K; 24R,K; 118R,K; 162R,K; and/or 188R,K.

12. The variant of claim 11 which comprises at least one of the following substitutions:
124R,K; 72R,K; 97R,K; 127R,K; 56R,K; 122R,K; 181R,K; 180R,K; 25R,K; and/or 92R,K.

13. The variant of any one of claims 1-4 which comprises at least one of the following 
substitutions: 147W; 43W. 


14. The variant of any one of claims 1-4 which comprises a substitution in at least one
position of at least one region selected from the group of regions consisting of:

(i) 160-170, 78-90, 43-50, 66-75, and 22-28;
(ii) 160-170, 78-90, 43-50, and 66-75;
(iii) 160-170, 78-90, and 43-50;
(iv) 160-170, and 78-90; and/or
(v) 160-170.

15. The variant of any one of claims 1-4 which comprises at least one of the following
substitutions: 6C; 8C; 13E,D; 16C; 24P; 25K,P,R; 33C; 42E,D; 46D,E; 47D,E; 55C; 56R,K;
62C; 63C; 70D,E; 72K,R; 76C; 81P; 82P; 84D,E; 85C; 92P,R,K; 93P; 94C,P; 95E,D; 96E,D,P;
97R,K; 98P; 103C; 105C,P; 106C; 122R,K; 124R,K; 127R,K; 136P; 140E,D; 141C; 144C;
145C; 149C; 150E,D; 151P; 173C; 176P; 180R,K; 181R,K; 184P; and/or 187P.

16. The variant of any one of claims 1-3, which comprises at least one of the following
substitutions: G6C; L7P; A8C; Y9C; T10E,D,Y; G12E,D; G13E,D,P; S16C; V17C; G18C;
T22A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y; N23A,C,D,E,F,G,H,I,K,L,M,P,Q,R,S,T,V,W,Y;
A24C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y (preferably A24S);
A25C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y (preferably A25S);
G26A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; Q27A,C,D,E,F,G,H,I,K,L,M,N,P,R,S,T,V,W,Y;
P28A,C,D,E,F,G,H,I,K,L,M,N,Q,R,S,T,V,W,Y; T32C; A33C; G37C; R38T; V39R,K;
Q42E,D,G,P; V43A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,W,Y;
T44A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y (preferably T44S);
I45A,C,D,E,F,G,H,K,L,M,N,P,Q,R,S,T,V,W,Y; G46A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
N47A,C,D,E,F,G,H,I,K,L,M,P,Q,R,S,T,V,W,Y; G48A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
R49A,C,D,E,F,G,H,I,K,L,M,N,P,Q,S,T,V,W,Y (preferably R49T,Q);
G50A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; V51T; F52C; E53Q; Q54N,R; S55C; V56I,R,K;
P58E,D; A62C,S; A63C; R66A,C,D,E,F,G,H,I,K,L,M,N,P,Q,S,T,V,W,Y;
G67A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; T68A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y;
S69A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,T,V,W,Y; N70A,C,D,E,F,G,H,I,K,L,M,P,Q,R,S,T,V,W,Y;
F71A,C,D,E,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; T72A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y;
L73A,C,D,E,F,G,H,I,K,M,N,P,Q,R,S,T,V,W,Y; T74A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y;
N75A,C,D,E,F,G,H,I,K,L,M,P,Q,R,S,T,V,W,Y; L76C;
S78A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,T,V,W,Y; R79A,C,D,E,F,G,H,I,K,L,M,N,P,Q,S,T,V,W,Y;
Y80A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W; N81A,C,D,E,F,G,H,I,K,L,M,P,Q,R,S,T,V,W,Y;
T82A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y (preferably T82S);
G83A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; G84A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
Y85A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W; A86C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y
(preferably A86Q); T87A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y (preferably T87S);
V88A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,W,Y; A89C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y
(preferably A89T,S); G90A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; H91T,S; N92P,R,K,S;
Q93P; A94C,P; P95A,E,D; I96A,E,D,P; G97R,K; S98P; S99A,Q,R,K; V100I; S103C; S105C,P;
T106C; C111R,K; T113E,D; I114V; G118N,R,K; S120T,E,D; S122R,K; P124R,K; E125P,Q;
T127R,K; T129E,D,Y,Q; N130E,D,S; M131L; T134C; T135P,N; V136P; E139C; P140E,D;
G141C; G144C; G145C; S146C; Y147F,W; I148C; S149C; G150E,D; T151P,E,D,S; G155C;
V156C; G160A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
S161A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,T,V,W,Y; G162A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
N163A,C,D,E,F,G,H,I,K,L,M,P,Q,R,S,T,V,W,Y; C164A,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
R165A,C,D,E,F,G,H,I,K,L,M,N,P,Q,S,T,V,W,Y (preferably R165S);
T166A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y (preferably T166V,F);
G167A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; G168A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
T169A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y; T170A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y;
F171Y; Y172C; Q173C; E174P; V175P; T176N,P; V179I,L; N180R,K,S; S181R,K; V184L,P;
R185T; L186I; R187P; and/or T188R,K; preferably at least one of the following:
T10Y, A25S, R38T, Q42P, T44S, R49K, Q54R, V56I, A62S, T82S, S99A, G118N, S120T,
S122R, E125Q, T129Y, N130S, M131L, R165S, T166A, F171Y, T176N, V179L, N180S,
V184L, and/or R185T.

17. The variant of any one of claims 1-3, which comprises at least one of the following
substitutions: G6C; L7P; A8C; Y9C; Y10E,D,T; G12E,D; G13E,D,P; S16C; V17C; G18C;
T22A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y; N23A,C,D,E,F,G,H,I,K,L,M,P,Q,R,S,T,V,W,Y;
S24A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,T,V,W,Y (preferably S24A);
A25C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y (preferably A25S);
G26A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; Q27A,C,D,E,F,G,H,I,K,L,M,N,P,R,S,T,V,W,Y;
P28A,C,D,E,F,G,H,I,K,L,M,N,Q,R,S,T,V,W,Y; T32C; A33C; G37C; T38R; V39R,K;
G42E,D,P,Q; V43A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,W,Y;
T44A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y (preferably T44S);
I45A,C,D,E,F,G,H,K,L,M,N,P,Q,R,S,T,V,W,Y; G46A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
N47A,C,D,E,F,G,H,I,K,L,M,P,Q,R,S,T,V,W,Y; G48A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
T49A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y (preferably T49R,Q);
G50A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; T51V; F52C; Q53E; N54Q,R; S55C; V56I,R,K;
P58E,D; A62C,S; A63C; R66A,C,D,E,F,G,H,I,K,L,M,N,P,Q,S,T,V,W,Y;
G67A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; T68A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y;
S69A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,T,V,W,Y; N70A,C,D,E,F,G,H,I,K,L,M,P,Q,R,S,T,V,W,Y;
F71A,C,D,E,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; T72A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y;
L73A,C,D,E,F,G,H,I,K,M,N,P,Q,R,S,T,V,W,Y; T74A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y;
N75A,C,D,E,F,G,H,I,K,L,M,P,Q,R,S,T,V,W,Y; L76C;
S78A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,T,V,W,Y; R79A,C,D,E,F,G,H,I,K,L,M,N,P,Q,S,T,V,W,Y;
Y80A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W; N81A,C,D,E,F,G,H,I,K,L,M,P,Q,R,S,T,V,W,Y;
S82A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,T,V,W,Y (preferably S82T);
G83A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; G84A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
Y85A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W; Q86A,C,D,E,F,G,H,I,K,L,M,N,P,R,S,T,V,W,Y
(preferably Q86A); S87A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,T,V,W,Y (preferably S87T);
V88A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,W,Y; T89A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y
(preferably T89A,S); G90A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; T91H,S; S92P,R,K,N;
Q93P; A94C,P; P95A,E,D; A96E,D,I,P; G97R,K; S98P; A99R,K,S; V100I; S103C; S105C,P;
T106C; C111R,K; T113E,D; I114V; N118G,R,K; T120E,D,S; R122K,S; P124R,K; Q125E,P;
T127R,K; Y129E,D,T; S130E,D,N; L131M; T134C; N135P,T; V136P; E139C; P140E,D;
G141C; G144C; G145C; S146C; F147W,Y; I148C; S149C; G150E,D; S151P,E,D,T; G155C;
V156C; G160A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
S161A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,T,V,W,Y; G162A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
N163A,C,D,E,F,G,H,I,K,L,M,P,Q,R,S,T,V,W,Y; C164A,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
S165A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,T,V,W,Y (preferably S165R);
V166A,C,D,E,G,H,I,K,L,M,N,P,Q,R,S,T,W,Y (preferably V166F,T);
G167A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; G168A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
T169A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y; T170A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y;
Y171F; Y172C; Q173C; E174P; V175P; T176P,N; I179L,V; N180R,K,S; S181R,K; V184L,P;
R185T; I186L; R187P; and/or T188R,K; preferably
S24A, A25S, G42P, T44S, T49K, T51V, Q53E, N54R, V56I, A62S, Q86A, S87T, T89A, T91H,
S92N, A96I, N135T, F147Y, S151T, V166A, T176N, I179L, N180S, V184L, R185T, and/or
I186L.

18. The variant of any one of claims 1-3, which comprises at least one of the following
substitutions: G6C; L7P; A8C; Y9C; T10E,D,Y; G12E,D; G13E,D,P; S16C; V17C; G18C;
T22A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y; N23A,C,D,E,F,G,H,I,K,L,M,P,Q,R,S,T,V,W,Y;
A24C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y (preferably A24S);
A25C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y (preferably A25S);
G26A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; Q27A,C,D,E,F,G,H,I,K,L,M,N,P,R,S,T,V,W,Y;
P28A,C,D,E,F,G,H,I,K,L,M,N,Q,R,S,T,V,W,Y; T32C; A33C; G37C; R38T; V39R,K;
Q42E,D,G,P; V43A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,W,Y;
S44A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,T,V,W,Y (preferably S44T);
I45A,C,D,E,F,G,H,K,L,M,N,P,Q,R,S,T,V,W,Y; G46A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
N47A,C,D,E,F,G,H,I,K,L,M,P,Q,R,S,T,V,W,Y; G48A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
Q49A,C,D,E,F,G,H,I,K,L,M,N,P,R,S,T,V,W,Y (preferably Q49R,T);
G50A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; V51T; F52C; E53Q; Q54N,R; S55C; I56R,K;
P58E,D; A62C,S; A63C; R66A,C,D,E,F,G,H,I,K,L,M,N,P,Q,S,T,V,W,Y;
G67A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; T68A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y;
S69A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,T,V,W,Y; N70A,C,D,E,F,G,H,I,K,L,M,P,Q,R,S,T,V,W,Y;
F71A,C,D,E,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; T72A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y;
L73A,C,D,E,F,G,H,I,K,M,N,P,Q,R,S,T,V,W,Y; T74A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y;
N75A,C,D,E,F,G,H,I,K,L,M,P,Q,R,S,T,V,W,Y; L76C;
S78A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,T,V,W,Y; R79A,C,D,E,F,G,H,I,K,L,M,N,P,Q,S,T,V,W,Y;
Y80A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W; N81A,C,D,E,F,G,H,I,K,L,M,P,Q,R,S,T,V,W,Y;
T82A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y (preferably T82S);
G83A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; G84A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
Y85A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W; A86C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y
(preferably A86Q); T87A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y (preferably T87S);
V88A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,W,Y; A89C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y
(preferably A89T,S); G90A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; H91T,S; N92P,R,K,S;
Q93P; A94C,P; P95A,E,D; I96A,E,D,P; G97R,K; S98P; S99A,Q,R,K; V100I; S103C; S105C,P;
T106C; C111R,K; T113E,D; I114V; G118N,R,K; S120E,D,T; S122R,K; P124R,K; E125P,Q;
T127R,K; T129E,D,Q,Y; N130E,D,S; M131L; T134C; T135N,P; V136P; E139C; P140E,D;
G141C; G144C; G145C; S146C; Y147F,W; I148C; S149C; G150E,D; N151P,E,D,T; G155C;
V156C; G160A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
S161A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,T,V,W,Y; G162A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
N163A,C,D,E,F,G,H,I,K,L,M,P,Q,R,S,T,V,W,Y; C164A,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
R165A,C,D,E,F,G,H,I,K,L,M,N,P,Q,S,T,V,W,Y (preferably R165S);
T166A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y (preferably T166F,V);
G167A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; G168A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
T169A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y; T170A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y;
F171Y; Y172C; Q173C; E174P; V175P; T176N,P; V179I,L; N180R,K,S; S181R,K; V184L,P;
R185T; L186I; R187P; and/or T188R,K; preferably
T10Y, A25S, R38T, Q42P, Q49K, Q54R, A62S, T82S, S99A, G118N, S120T, S122R, E125Q,
T129Y, N130S, M131L, N151T, R165S, T166A, F171Y, T176N, V179L, N180S, V184L, and/or
R185T.

19. The variant of any one of claims 1-3, which comprises at least one of the following
substitutions: G6C; L7P; A8C; Y9C; T10E,D,Y; G12E,D; G13E,D,P; S16C; V17C; G18C;
T22A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y; N23A,C,D,E,F,G,H,I,K,L,M,P,Q,R,S,T,V,W,Y;
A24C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y (preferably A24S);
A25C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y (preferably A25S);
G26A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; Q27A,C,D,E,F,G,H,I,K,L,M,N,P,R,S,T,V,W,Y;
P28A,C,D,E,F,G,H,I,K,L,M,N,Q,R,S,T,V,W,Y; T32C; A33C; G37C; R38T; V39R,K;
Q42E,D,G,P; V43A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,W,Y;
T44A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y (preferably T44S);
I45A,C,D,E,F,G,H,K,L,M,N,P,Q,R,S,T,V,W,Y; G46A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
N47A,C,D,E,F,G,H,I,K,L,M,P,Q,R,S,T,V,W,Y; G48A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
R49A,C,D,E,F,G,H,I,K,L,M,N,P,Q,S,T,V,W,Y (preferably R49Q,T);
G50A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; V51T; F52C; E53Q; Q54N,R; S55C; I56R,K;
P58E,D; A62C,S; A63C; R66A,C,D,E,F,G,H,I,K,L,M,N,P,Q,S,T,V,W,Y;
G67A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; T68A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y;
S69A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,T,V,W,Y; N70A,C,D,E,F,G,H,I,K,L,M,P,Q,R,S,T,V,W,Y;
F71A,C,D,E,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; T72A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y;
L73A,C,D,E,F,G,H,I,K,M,N,P,Q,R,S,T,V,W,Y; T74A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y;
N75A,C,D,E,F,G,H,I,K,L,M,P,Q,R,S,T,V,W,Y; L76C;
S78A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,T,V,W,Y; R79A,C,D,E,F,G,H,I,K,L,M,N,P,Q,S,T,V,W,Y;
Y80A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W; N81A,C,D,E,F,G,H,I,K,L,M,P,Q,R,S,T,V,W,Y;
T82A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y (preferably T82S);
G83A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; G84A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
Y85A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W; A86C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y
(preferably A86Q); T87A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y (preferably T87S);
V88A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,W,Y; A89C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y
(preferably A89S,T); G90A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; H91T,S; N92P,R,K,S;
Q93P; A94C,P; P95A,E,D; I96A,E,D,P; G97R,K; S98P; S99A,Q,R,K; V100I; S103C; S105C,P;
T106C; C111R,K; T113E,D; I114V; G118N,R,K; S120E,D,T; S122R,K; P124R,K; E125P,Q;
T127R,K; T129E,D,Q,Y; N130E,D,S; M131L; T134C; T135N,P; V136P; E139C; P140E,D;
G141C; G144C; G145C; S146C; Y147F,W; I148C; S149C; G150E,D; N151P,E,D,S,T;
G155C; V156C; G160A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
S161A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,T,V,W,Y; G162A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
N163A,C,D,E,F,G,H,I,K,L,M,P,Q,R,S,T,V,W,Y; C164A,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
R165A,C,D,E,F,G,H,I,K,L,M,N,P,Q,S,T,V,W,Y (preferably R165S);
T166A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y (preferably T166V);
G167A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; G168A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
T169A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y; T170A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y;
F171Y; Y172C; Q173C; E174P; V175P; T176N,P; V179I,L; N180R,K,S; S181R,K; V184L,P;
R185T; L186I; R187P; and/or T188R,K; preferably
T10Y, A25S, R38T, Q42P, T44S, R49K, Q54R, A62S, T82S, S99A, G118N, S120T, S122R,
E125Q, T129Y, N130S, M131L, N151T, R165S, T166A, F171Y, T176N, V179L, N180S,
V184L, and/or R185T.

20. The variant of any one of claims 1-3, which comprises at least one of the following
substitutions: G6C; L7P; A8C; Y9C; T10E,D,Y; G12E,D; G13E,D,P; S16C; V17C; G18C;
T22A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y; N23A,C,D,E,F,G,H,I,K,L,M,P,Q,R,S,T,V,W,Y;
A24C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y (preferably A24S);
S25A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,T,V,W,Y (preferably S25A);
G26A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; Q27A,C,D,E,F,G,H,I,K,L,M,N,P,R,S,T,V,W,Y;
P28A,C,D,E,F,G,H,I,K,L,M,N,Q,R,S,T,V,W,Y; T32C; A33C; G37C; T38R; V39R,K;
P42E,D,G,Q; V43A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,W,Y;
S44A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,T,V,W,Y (preferably S44T);
I45A,C,D,E,F,G,H,K,L,M,N,P,Q,R,S,T,V,W,Y; G46A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
N47A,C,D,E,F,G,H,I,K,L,M,P,Q,R,S,T,V,W,Y; G48A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
Q49A,C,D,E,F,G,H,I,K,L,M,N,P,R,S,T,V,W,Y (preferably Q49R,T);
G50A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; V51T; F52C; E53Q; R54N,Q; S55C; V56I,R,K;
P58E,D; S62A,C; A63C; R66A,C,D,E,F,G,H,I,K,L,M,N,P,Q,S,T,V,W,Y;
G67A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; T68A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y;
S69A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,T,V,W,Y; N70A,C,D,E,F,G,H,I,K,L,M,P,Q,R,S,T,V,W,Y;
F71A,C,D,E,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; T72A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y;
L73A,C,D,E,F,G,H,I,K,M,N,P,Q,R,S,T,V,W,Y; T74A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y;
N75A,C,D,E,F,G,H,I,K,L,M,P,Q,R,S,T,V,W,Y; L76C;
S78A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,T,V,W,Y; R79A,C,D,E,F,G,H,I,K,L,M,N,P,Q,S,T,V,W,Y;
Y80A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W; N81A,C,D,E,F,G,H,I,K,L,M,P,Q,R,S,T,V,W,Y;
T82A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y (preferably T82S);
G83A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; G84A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
Y85A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W; A86C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y
(preferably A86Q); T87A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y (preferably T87S);
V88A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,W,Y; S89A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,T,V,W,Y
(preferably S89A,T); G90A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; S91H,T; S92P,R,K,N;
Q93P; A94C,P; A95E,D,P; I96A,E,D,P; G97R,K; S98P; Q99A,R,K,S; I100V; S103C; S105C,P;
T106C; C111R,K; T113E,D; V114I; G118N,R,K; T120E,D,S; S122R,K; P124R,K; Q125E,P;
T127R,K; Q129E,D,Y,T; N130E,D,S; L131M; T134C; N135P,T; V136P; E139C; P140E,D;
G141C; G144C; G145C; S146C; F147W,Y; I148C; S149C; G150E,D; S151P,E,D,T; G155C;
V156C; G160A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
S161A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,T,V,W,Y; G162A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
N163A,C,D,E,F,G,H,I,K,L,M,P,Q,R,S,T,V,W,Y; C164A,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
S165A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,T,V,W,Y (preferably S165R);
F166A,C,D,E,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y (preferably F166T,V);
G167A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; G168A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
T169A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y; T170A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y;
Y171F; Y172C; Q173C; E174P; V175P; T176N,P; L179I,V; S180R,K,N; S181R,K; L184P,V;
T185R; L186I; R187P; and/or T188R,K; preferably
T10Y, Q49K, V56I, T82S, S89A, S91H, S92N, A95P, Q99A, I100V, V114I, G118N, S122R,
Q129Y, N130S, N135T, F147Y, S151T, and/or F166A.



21. The variant of any one of claims 1-16 and 18-20 which comprises at least one of the
following substitutions: T10Y, A24S, V51T, E53Q, T82S, A86Q, T87S, I96A, G118N, S122R,
N130S, L186I.
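
The substitution shorthand used in these claims (parent residue, position in SEQ ID NO: 2 numbering, one or more replacement residues) can be read mechanically. The following Python sketch is illustrative only and is not part of the claimed subject matter: the helper names `parse_substitution` and `apply_substitution` are hypothetical, and the test fragment is assumed to be residues 1-16 of SEQ ID NO: 2 as given in the sequence listing.

```python
import re

# Shorthand such as "T10Y" or "T10E,D,Y":
# parent residue, 1-based position, comma-separated replacement residues.
_SUB = re.compile(r"^([A-Z])(\d+)([A-Z](?:,[A-Z])*)$")


def parse_substitution(label):
    """Split a label like 'T10E,D,Y' into (parent, position, replacements)."""
    match = _SUB.match(label.strip())
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    parent, position, replacements = match.groups()
    return parent, int(position), replacements.split(",")


def apply_substitution(sequence, label):
    """Return the sequence with a single-replacement label applied."""
    parent, position, replacements = parse_substitution(label)
    if len(replacements) != 1:
        raise ValueError("choose exactly one replacement residue")
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[position - 1] != parent:
        raise ValueError(f"residue at position {position} is not {parent}")
    return sequence[: position - 1] + replacements[0] + sequence[position:]


# Assumed fragment: residues 1-16 of SEQ ID NO: 2 in one-letter code.
fragment = "ADIIGGLAYTMGGRCS"
print(parse_substitution("T10E,D,Y"))
print(apply_substitution(fragment, "T10Y"))
```

The parent-residue check is the useful part of the design: a label only applies when the stated parent residue is actually present at that position, which catches numbering mistakes early.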

22. The variant of any one of claims 1-16 and 18-19 which comprises at least one of the
following substitutions: R38T; Q42G,P; R49T,Q; Q54N,R; A89S,T; H91S,T; N92S; S99A,Q;
A120T; E125Q; T129Y,Q; M131L; T135N; Y147F; N151S; R165S; T166V,F; F171Y; V179I,L;
preferably at least one of the following substitutions: R38T; N92S; A120T; E125Q; M131L;
T135N; Y147F; N151S; R165S; and/or F171Y.

23. The variant of any one of claims 1-19 which comprises at least one of the following
substitutions: A25S, T44S, A62S, P95A, V100I, I114V, T176N, N180S, V184L, R185T.

24. The variant of any one of claims 1-23 which is not identical to any one of the following
amino acid sequences: SEQ ID NO: 2, amino acids 1-188 of SEQ ID NO: 4, amino acids 1-188
of SEQ ID NO: 6, amino acids 1-188 of SEQ ID NO: 8, and amino acids 1-188 of SEQ ID
NO: 10.

25. The variant of any one of claims 1-24 which is not identical to the protease derived
from Nocardiopsis dassonvillei NRRL 18133, provided the latter has at least 60% identity to
SEQ ID NO: 2.

26. The variant of any one of claims 1-25 which is not identical to the protease derived
from Nocardiopsis sp. FERM P-10508, and/or Nocardiopsis dassonvillei strain ZIMET 43647,
provided the respective protease has at least 60% identity to SEQ ID NO: 2.

27. The variant of any one of claims 1-26 which is not identical to any protease that may 
belong to the prior art and which has at least 60% identity to SEQ ID NO: 2. 

28. The variant of any one of claims 1-27 which has amended properties, such as an
improved thermostability and/or a higher or lower optimum temperature.

29. The variant of claim 28 which has a Tm of at least 83.1°C as measured by DSC in
10 mM sodium phosphate, 50 mM sodium chloride, pH 7.0.

30. The variant of any one of claims 1-29 which derives from a strain of the genus
Nocardiopsis.

31. The variant of claim 30 which derives from a strain of Nocardiopsis alba, Nocardiopsis
antarctica, Nocardiopsis prasina, Nocardiopsis composta, Nocardiopsis dassonvillei,
Nocardiopsis exhalans, Nocardiopsis halophila, Nocardiopsis halotolerans, Nocardiopsis
kunsanensis, Nocardiopsis listeri, Nocardiopsis lucentensis, Nocardiopsis metallicus,
Nocardiopsis sp., Nocardiopsis synnemataformans, Nocardiopsis trehalosi, Nocardiopsis
tropica, Nocardiopsis umidischolae, or Nocardiopsis xinjiangensis.

32. The variant of claim 31 which derives from Nocardiopsis alba DSM 15647,
Nocardiopsis dassonvillei NRRL 18133, Nocardiopsis dassonvillei subsp. dassonvillei DSM
43235, Nocardiopsis prasina DSM 15646, Nocardiopsis prasina DSM 15648, Nocardiopsis sp.
NRRL 18262, Nocardiopsis dassonvillei strain ZIMET 43647, or Nocardiopsis sp. FERM
P-10508.



33. A method for generating a protease variant of an improved property, the method
comprising the following steps:

(a) selecting a parent protease of at least 60% identity to SEQ ID NO: 2;

(b) establishing a 3D structure of the parent protease by homology modelling using
the Fig. 2 structure as a model; and/or aligning the parent protease according to the alignment
of Fig. 1;

(c) proposing at least one amino acid substitution, e.g. by:

(i) subjecting the 3D structure of (b) to MD simulations at increased
temperatures, and identifying regions in the amino acid sequence of the parent
protease of high mobility (isotropic fluctuations);

(ii) introducing disulfide bridges by way of cysteine substitutions (C-C);

(iii) introducing proline substitutions (P);

(iv) replacing exposed neutral amino acid residues with negatively charged
amino acid residues (E,D);

(v) replacing exposed neutral amino acid residues with positively charged
amino acid residues (R,K);

(vi) replacing small amino acid residues inside the protein with bulkier amino
acid residues (W);

(vii) comparing by homology alignment and/or homology modelling
according to step (c)(i) at least two related parent proteases and transferring amino
acid residue differences between these protease backbones, preferably from a
backbone having the improved property to a backbone not having this improved
property;

(d) preparing a DNA sequence encoding the parent protease but for inclusion of a
DNA codon of the at least one amino acid substitution proposed in steps (c)(ii)-(c)(vii), or
subjecting the parent DNA sequence to random mutagenesis, targeting at least one of the
regions identified in step (c)(i);

(e) expressing the DNA sequence obtained in step (d) in a host cell, and

(h) selecting a host cell expressing a protease variant with an improved property.
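
Steps (a) and (c)(vii) of the method above can be illustrated with a short Python sketch. It is a simplified illustration under stated assumptions, not the procedure prescribed by the specification: identity is computed here as a plain position-by-position count over two gap-free, pre-aligned fragments, the function names are hypothetical, and the two fragments are assumed to be residues 1-32 of Protease 10 (SEQ ID NO: 2) and of the mature Protease 18 (SEQ ID NO: 4) as read from the sequence listing.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of identical positions in two pre-aligned, gap-free sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned to the same non-zero length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def residue_differences(recipient, donor):
    """List the substitutions that turn the recipient backbone into the donor."""
    if len(recipient) != len(donor):
        raise ValueError("sequences must be aligned to the same length")
    return [
        f"{r}{position}{d}"
        for position, (r, d) in enumerate(zip(recipient, donor), start=1)
        if r != d
    ]


# Assumed fragments (residues 1-32, one-letter code).
protease_10 = "ADIIGGLAYTMGGRCSVGFAATNAAGQPGFVT"
protease_18 = "ADIIGGLAYYMGGRCSVGFAATNSAGQPGFVT"

print(percent_identity(protease_10, protease_18))
print(residue_differences(protease_10, protease_18))
```

On these fragments the differences come out as T10Y and A24S, two of the substitutions listed in claim 21. A real comparison would use the full-length alignment and whatever identity definition the description specifies.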
34. A method of preparing a protease variant, the method comprising the steps of

(a) cultivating the host cell of claim 33(h) to produce a supernatant comprising the variant; and

(b) recovering the variant.

35. An isolated nucleic acid sequence comprising a nucleic acid sequence which encodes
the protease variant of any of claims 1-32, or the protease variant obtainable according to
claim 34.

36. The nucleic acid sequence of claim 35 which is not identical to any one of the following
nucleic acid sequences: nucleotides 900-1466 of SEQ ID NO: 1, nucleotides 499-1062 of
SEQ ID NO: 3, nucleotides 496-1059 of SEQ ID NO: 5, nucleotides 496-1059 of SEQ ID NO:
7, and nucleotides 502-1065 of SEQ ID NO: 9.



37. The nucleic acid sequence of claim 35 which is not identical to the nucleic acid
sequence encoding the mature peptide part of the protease derived from Nocardiopsis
dassonvillei NRRL 18133, provided this protease has at least 60% identity to SEQ ID NO: 2.

38. The nucleic acid sequence of claim 35 which is not identical to the nucleic acid
sequence encoding the mature peptide part of the proteases derived from Nocardiopsis sp.
FERM P-10508, and/or Nocardiopsis dassonvillei strain ZIMET 43647, provided the respective
protease has at least 60% identity to SEQ ID NO: 2.

39. A nucleic acid construct comprising the nucleic acid sequence of any one of claims 35-38
operably linked to one or more control sequences that direct the production of the protease
variant in a suitable expression host.

40. A recombinant expression vector comprising the nucleic acid construct of claim 39. 

41. A recombinant host cell comprising the nucleic acid construct of claim 39 and/or the
expression vector of claim 40.

42. A method for producing the protease variant of any one of claims 1-32, or the variant
obtainable according to claim 34, the method comprising:

(a) cultivating the host cell of claim 41 to produce a supernatant comprising the variant; and
(b) recovering the variant.
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43. A transgenic plant, or plant part, capable of expressing a protease variant of any one of 
claims 1-32, and/or a protease variant obtainable according to claim 34. 

44. A transgenic, non-human animal, or products, or elements thereof, being capable of
expressing a protease variant of any one of claims 1-32, and/or a protease variant obtainable
according to claim 34.

45. A composition comprising at least one protease variant of any one of claims 1-32,
and/or a protease variant obtainable according to claim 34, and

(a) at least one fat soluble vitamin;

(b) at least one water soluble vitamin; and/or

(c) at least one trace mineral.

46. The composition of claim 45 further comprising at least one enzyme selected from the
following group of enzymes: alpha-amylases, galactanases, alpha-galactosidases, xylanases,
endoglucanases, endo-1,3(4)-beta-glucanases, and phytases.

47. The composition of any one of claims 45-46 which is an animal feed additive. 

48. An animal feed composition having a crude protein content of 50 to 800 g/kg and
comprising the protease variant of any one of claims 1-32, and/or the protease variant
obtainable according to claim 34, and/or the composition of any one of claims 45-47.

49. A method for improving the nutritional value of an animal feed, wherein the protease
variant of any one of claims 1-32, and/or the protease variant obtainable according to claim
34, and/or the composition of any one of claims 45-48 is added to the feed.

50. A method for the treatment of vegetable proteins, comprising the step of adding the
protease variant of any one of claims 1-32, and/or the protease variant obtainable according to
claim 34, and/or the composition of any one of claims 45-48 to at least one vegetable protein
or protein source.

51. Use of the protease variant of any one of claims 1-32, and/or the protease variant
obtainable according to claim 34, and/or the composition of any one of claims 45-48 (i) in
animal feed; (ii) in the preparation of animal feed; (iii) for improving the nutritional value of
animal feed; and/or (iv) for the treatment of vegetable proteins.
52. Use of the protease variant of any one of claims 1-32, and/or the protease variant
obtainable according to claim 34 in detergents.

Abstract

The invention relates to variants of a parent protease homologous to Nocardiopsis
proteases, in particular variants of improved thermostability and/or with an amended
temperature activity profile, e.g. a temperature optimum of 80°C instead of 70°C. The
invention also relates to DNA sequences encoding such variants, their production in a
recombinant host cell, as well as methods of using the variants, in particular within the field of
animal feed and detergents. The invention furthermore relates to methods of generating and
preparing protease variants of amended properties.

1                                               50
ADIIGGLAYTMGGRCSVGFAATNAAGQPGFVTAGHCGRVGTQVTIGNGRG
ADIIGGLAYYMGGRCSVGFAATNSAGQPGFVTAGHCGTVGTGVTIGNGTG
ADIIGGLAYTMGGRCSVGFAATNAAGQPGFVTAGHCGRVGTQVSIGNGQG
ADIIGGLAYTMGGRCSVGFAATNAAGQPGFVTAGHCGRVGTQVTIGNGRG
ADIIGGLAYTMGGRCSVGFAATNASGQPGFVTAGHCGTVGTPVSIGNGQG
ADIIGGLAYYMGGRCSVGFAATNASGQPGFVTAGHCGTVGTPVSIGNGKG

51                                             100
VFEQSVFPGNDAAFVRGTSNFTLTNLVSRYNTGGYATVAGHNQAPIGSSV
TFQNSVFPGNDAAFVRGTSNFTLTNLVSRYNSGGYQSVTGTSQAPAGSAV
VFEQSIFPGNDAAFVRGTSNFTLTNLVSRYNTGGYATVAGHNQAPIGSSV
VFEQSIFPGNDAAFVRGTSNFTLTNLVSRYNTGGYATVAGHNQAPIGSSV
VFERSVFPGNDSAFVRGTSNFTLTNLVSRYNTGGYATVSGSSQAAIGSQI
VFERSIFPGNDSAFVRGTSNFTLTNLVSRYNSGGYATVAGHNQAPIGSAV

101                                            150
CRSGSTTGWHCGTIQARGQSVSYPEGTVTNMTRTTVCAEPGDSGGSYISG
CRSGSTTGWHCGTIQARNQTVRYPQGTVYSLTRTNVCAEPGDSGGSFISG
CRSGSTTGWHCGTIQARGQSVSYPEGTVTNMTRTTVCAEPGDSGGSYISG
CRSGSTTGWHCGTIQARGQSVSYPEGTVTNMTRTTVCAEPGDSGGSYISG
CRSGSTTGWHCGTVQARGQTVSYPQGTVQNLTRTNVCAEPGDSGGSFISG
CRSGSTTGWHCGTIQARNQTVRYPQGTVYSLTRTTVCAEPGDSGGSYISG

151                                188
TQAQGVTSGGSGNCRTGGTTFYQEVTPMVNSWGVRLRT
SQAQGVTSGGSGNCSVGGTTYYQEVTPMINSWGVRIRT
NQAQGVTSGGSGNCRTGGTTFYQEVTPMVNSWGVRLRT
NQAQGVTSGGSGNCRTGGTTFYQEVTPMVNSWGVRLRT
SQAQGVTSGGSGNCSFGGTTYYQEVMPMLSSWGLTLRT
TQAQGVTSGGSGNCSAGGTTYYQEVNPMLSSWGLTLRT

[Structure coordinates (cf. the Fig. 2 structure): PDB-style ATOM records, ending at residue THR 188; coordinate values not legible.]

10508-010-DK.ST25
SEQUENCE LISTING

<110> Novozymes A/S

<120> Protease variants

<130> 10508.010-DK

<160> 12

<170> PatentIn version 3.2

<210> 1
<211> 1596
<212> DNA
<213> Nocardiopsis sp. NRRL 18262 ("Protease 10")

<220>
<221> CDS
<222> (900)..(1466)


<400> 1 

acgtttggta cgggtaccgg tgtccgcatg tggccagaat 

■ 

ggattcggtc ggtagcgcat cgactccgac aaccgcgagg 
ttctgcgacc gtcatgcgac ccatcatcgg gtgaccccac 
gttctgacgg tctttccctc accaaaacgt gcacctatgg 
tgtctcggtg aacgacaggg gccggacggt attcggcccc 
aggagagtag ggaccccatg cgaccctccc ccgttgtctc 
tggcettcgg tctggcgctg tccggtaccc cgggtgccct 
cccagtcacc caccccggag gccgacgcgg tctccatgca 
tcgacctgac ctccgccgag gccgaggagc tgctggccgc 
tcgacgaggc cgcggccgag gccgccgggg acgcctacgg 
agagcctgga actgaccgtc ctggtcaccg atgccgccgc 
ccggcgccgg gaccgagctg gtctcctacg gcatcgacgg 

« 

agctcaacgc cgccgacgcc gttcccggtg tggtcggctg 
acaccgtcgt cctggaggtc ctggagggtt ccggagccga 
acgccggcgt ggacgcctcg gccgtcgagg tgaccacgag 



gcccccttgc 
tggccgttcg 
cgagctctga 
ttaggacgtt 
gatcccccgt 
cgccatcggt 
cgcggccacc 
ggaggcgctc 
ccaggacacc 
cggctccgtc 
ggtcgaggcc 



gacagggaac 
cgtcgccacg 
atggtccacc 
gtttaccgaa 
tgatcccccc 
acgggagcgc 
ggagcgctcc 
cagcgcgacc 
gccttcgagg 
ttcgacaccg 
gtggaggcca 



tctcgacgag atcgtccagg 
gtacccggac gtggcgggtg 
cgtcagcggc ctgctcgcgg 
cgaccagccc gagctctac 



Ala Asp He He Gly Gly Leu Ala Tyr Thr Met 
1 5 10 



gcc gac ate ate got got ctg gec tac acc at9 gac ggc cgc tgt teg 

- 6 j LGu Ala Thr Gly Gl 



Gly Gly Arg C^s Ser 



gtc ggc ttc geg gcc acc aac gcc gee ggt cag ccc ggg ttc gtc acc 
val GTy Phe Ala Ala Thr Asn Ala Ala Giy Gin Pro Gly Phe Val Thr 

20 25 30 

gcc ggt cac tgc ggc cgc gtg ggc acc cag gtg acc ate ggc aac ggc 
Ala Gly His Cys Gly Arg vaT Giy Thr Gin val Thr lie Gly Asn Gly 

)C Af\ >1C 



a 99 ggc gtc ttc gag cag tec gtc ttc ccc ggc aac gac gcg gcc ttc 
Arg Gly val Phe Glu Gin ser val Phe Pro Gly Asn Asp Ala Ala Phe 
50 55 60 


gtc cgc ggt acg tec aac ttc acg ctg acc aac ctg gtc age cgc tac 1139 

yal Arg Gly Thr Ser Asn Phe Thr Leu Tbr Asn Leu val Ser Arg Tyr 
65 70 75 SO 

aac acc ggc gqg tac gec acg gtc gec ggt cac aac cag gec ccc ate 1187 
Asn Thr Gly Gly Tyr Ala Thr val Ala' Gly His Asn Gin Ala pro lie 

85 : 90 95 

ggc tec tec gtc tgc cgc tec ggc tec acc acc ggt tgg cac tgc ggc 1235 
Gly ser ser val cys Arg ser Gly ser Thr Thr Giy Trp His Cys Gly 

100 * 105» 110 

acc ate cag gec cgc ggc cag teg gtg age tac ccc gag ggc acc gtc 
Thr lie Gin Ala Arg Gly Gin Ser vaT Ser Tyr Pro Glu Gly Thr val . 

115 120 125 : ! 

acc aac atg acc egg acc acc gtg tgc : gcc gag ccc ggc gac tec ggc*) 1331 
Thr Asn Met Thr Arg Thr Thr val Cys Ala Glu Pro Gly Asp Ser Gly : 
130 135 140 

gac tec tac ate tec ggc acc cag gec cag ggc gtg acc tec ggc ggc' i 1379 
Gly Ser Tyr lie Ser Gly Thr Gin Ala Gin Gly val Thr ser Gly Gly : 
145 150 155 160" 



tec ggc aac tgc cgc acc ggc ggg acc acc ttc tac cag gag gtc acc ! 
Ser Gly Asn cys Arg Thr Gly Gly Thr Thr Phe Tyr Gin Glu val Thr 

165 170 175 




Ala Gly His Cys Gly Arg Val Gly Thr Gin val Thr lie Gly Asn Gly 

35 40 45 

Arg Gly val Phe Glu Gin ser val Phe Pro Gly Asn Asp Ala Ala Phe 
50 55 60 

val Arg Gly Thr ser Asn Phe Thr Leu Thr Asn Leu val ser Arg Tyr 
65 70 75 80 

Asn Thr Gly Gly Tyr Ala Thr Val Ala Gly His Asn Gin Ala Pro lie 

85 90 95 





1283 



1427 



ccc atg gtg aac tec tgg ggc gtc cgt etc egg acc tga tccccgcggt 1476 
Pro Met VaT Asn Ser Trp Gly val Arg Leu Arg Thr 

180 185 

tecaggegga ccgacggtcg tgacctgagt accaggcgtc cccgccgctt ccagcggcgt 1536 

ccgcaccggg gtgggaccgg gcgtggccac ggccccacce gtgaceggae cgcccggcta 1596 

<210> 2

<211> 188

<212> PRT

<213> Nocardiopsis sp. NRRL 18262 ("Protease 10")

<400> 2

Ala Asp Ile Ile Gly Gly Leu Ala Tyr Thr Met Gly Gly Arg Cys Ser
1               5                   10                  15

Val Gly Phe Ala Ala Thr Asn Ala Ala Gly Gln Pro Gly Phe Val Thr
            20                  25                  30






Gly Ser Ser Val Cys Arg Ser Gly Ser Thr Thr Gly Trp His Cys Gly
            100                 105                 110

Thr Ile Gln Ala Arg Gly Gln Ser Val Ser Tyr Pro Glu Gly Thr Val
        115                 120                 125

Thr Asn Met Thr Arg Thr Thr Val Cys Ala Glu Pro Gly Asp Ser Gly
    130                 135                 140

Gly Ser Tyr Ile Ser Gly Thr Gln Ala Gln Gly Val Thr Ser Gly Gly
145                 150                 155                 160

Ser Gly Asn Cys Arg Thr Gly Gly Thr Thr Phe Tyr Gln Glu Val Thr
                165                 170                 175

Pro Met Val Asn Ser Trp Gly Val Arg Leu Arg Thr
            180                 185

<210> 3

<211> 1065

<212> DNA

<213> Nocardiopsis dassonvillei subspecies dassonvillei DSM 43235 ("Protease 18")

<220>

<221> CDS

<222> (1)..(1062)

<220>

<221> mat_peptide
<222> (499)..(1062)

<400> 3

get ccg gec ccc gtc ccc cag acc ccc gtc gec gac gac age gec 45 

Ala Pro Ala Pro val pro Gin Thr Pro val Ala Asp Asp Ser Ala 
-165 -160 -155 

gec age atg acc gag gcg etc aag cgc gac etc gac etc acc teg 90 
Ala ser Met Thr Glu Ala Leu Lys Arg Asp Leu Asp Leu Thr ser 
-150 -145 -140 

gec gag gec gag gag ctt etc teg gcg cag gaa gee gee ate gag 
Ala Glu Ala Glu Glu Leu Leu Ser Ala Gin Glu Ala Ala lie Glu 
-135 -130 -125 

acc gac gee gag gec acc gag gec gcg ggc gag gec tae ggc ggc 
Thr Asp Ala Glu Ala Thr Glu Ala Ala Gly Glu Ala Tyr Gly Gly 
-120 -115 -HO 

tea ctg ttc gac acc gag ace etc gaa etc acc gtg ctg gtc acc gac 228,. 
ser Leu phe Asp Thr Glu Thr Leu Glu Leu Thr val Leu val Thr Asp 
-105 -100 -95 

gec tee gee gtc gag gcg gtc gag gee acc gga gec cag gec ace gtc 
Ala Ser Ala val Glu Ala val Glu Ala Thr Gly Ala Gin Ala Thr val 
-90 -85 -80 -75 

gtc tec cac ggc acc gag ggc ctg acc gag gtc gtg gag gac etc aac 324 
val Ser His Gly Thr Glu Gly Leu Thr Glu val val Glu Asp Leu Asn 

-70 -65 -60 




135 



180 



276 



to 






goc gcc gag gtt ccc gag age gtc etc ggc tgg tac ccg gac gtg gag- 372 
Gly Ala Glu Val Pro Glu Ser val Leu. Gly Trp Tyr Pro Asp vaT Glu!*: 

-55 -50 -45 

age gac acc gtc gtg gtc gag gtg ctg gag ggc tec gac gcc gac gtc,; 420 
Ser Asp Thr val val val Glu vaT Leu Glu Gly Ser Asp Ala Asp val 
-40 -35 -30 



gcc gcc ctg etc gcc gac gcc ggt gtg gac tec tec teg gtc egg gtg 
Ala Ala Leu Leu Ala Asp Ala Gly val Asp Ser ser Ser val Arg VaT 
-25 -20 -15 

gag gag gcc gag gag gcc ccg cag gtc tac gcc gac ate ate ggc ggc 
Glu Glu Ala Glu Glu Ala Pro Gin val Tyr Ala Asp lie He Gly Gly 
-10 -5 -11 5 

ctg gcc tac tac atg ggc ggc cgc tgc tec gtc ggc ttc gcc gcg acc 
Leu Ala Tyr Tyr Met Gly Gly Arg cys ser Val Gly Phe Ala Ala Thr 

10 15 20 



ttc acc ctg acc aac ctg gtc teg cgc tac aac tec ggc ggc tac cag 
Phe Thr Leu Thr Asn Leu val Ser Arg Tyr Asn Ser Gly Gly Tyr Gln ; 

75 80 85 

teg gtg acc got acc age cag gcc ccg gcc ggc teg gcc gtg tgc cgc 
Ser val Thr Gly Thr Ser Gin Ala Pro Ala Gly Ser Ala VaT cys Arg 

90 95 100 



tec ggc tec acc acc ggc tgg cac tgc ggc acc ate cag gcc cgc 
Ser Gly ser Thr Thr Gly Trp His cys Gly Thr lie Gin Ala Arg Asn 



cag acc gtg cgc tac ccg cag ggc acc gtc tac teg etc acc cgc acc 
Gin Thr val Arg Tyr Pro Gin Gly Thr Val Tyr Ser Leu Thr Arg Thr 
120 ,125 130 

aac gtg tgc gcc gag ccc ggc gac tec ggc ggt teg ttc ate tec ggc 
Asn val cys Ala Glu Pro Gly Asp Ser Giy Gly ser Phe lie Ser Gly 
135 140 145 150 



is 



468 



516 



564 



aac age gcc ggt cag ccc ggt ttc gtc acc gcc ggc cac tgc ggc acc 612 
Asn Ser Ala Gly Gin Pro Gly Phe val Thr Ala Gly His cys Gly Thr 
25 30 35 

gtc ggc acc ggc gtg acc ate ggc aac ggc acc ggc ace ttc cag aac 660 
val Gly Thr Gly val Thr lie Gly Asn Gly Thr Gly Thr Phe Gin Asn r . 
40 45 50 " 

teg gtc ttc ccc ggc aac gac gcc gcc ttc gtc cgc ggc acc tec aac 708 
ser val Phe Pro Gly Asn Asp Ala Ala phe val Arg Gly Thr ser Asn 

55 60 65 70 ir: 



756 



804 



aac 852 

Ala Ara 

105 110 115 



900 



948 



teg cag gcc cag ggc gtc acc tec ggc ggc tec ggc aac tgc tec gtc 996 
ser Gin Ala Gin Giy val Thr ser Gly Gly Ser Gly Asn cys ser val 

155 160 ■ 165 



ggc ggc acg acc tac tac cag gag gtc. acc ccg atg ate aac tec tgg 1044 
Gly Gly Thr Thr Tyr Tyr Gin Glu val Thr Pro Met lie Asn Ser Trp 

170 175 180 

ggt gtc agg ate egg acc taa 1065 
Gly Val Arg lie Arg Thr 
185 

<210> 4
<211> 354
<212> PRT

<213> Nocardiopsis dassonvillei subspecies dassonvillei DSM 43235 ("Protease 18")

<400> 4

Ala Pro Ala Pro Val Pro Gin Thr Pro Val Ala Asp Asp Ser Ala : 
-165 -160 -iSs 

Ala s ?f^ Met Thr Glu Ala Leu *-ys Arg Asp Leu Asp Leu Thr ser 
-150 -145 -140 

Ala Glu Ala Glu Glu Leu Leu ser Ala Gin Glu Ala Ala lie Glu 
-135 -130 -125 

Thr A ?!L Ala Glu Ala Thr Glu Ala A la Gly Glu Ala Tyr Gly Gly 
-120 -us -no 

« 

Ser L ?H^ phe As P Thr Glu Thr LeLr Gl « Leu Thr val Leu val Thr Asp 
-105 -100 -95 

A lS Ser Ala val Glu Ala val Glu Ala Th r G ly Ala Gin Ala Thr val 
-90 -85 -80 -75 

■ 

Val ser His Gly Thr Glu Gly Leu Thr Glu val val Glu Asp Leu Asn . 

-70 -65 -60 

Gly Ala Glu val Pro Glu ser Val Leu Gly Trp Tyr pro Asp val Glu '* 

-55 -50 -45 

Asp Thr val Val val Glu val Leu Glu Gly ser Asp Ala Asp val . 
-40 -35 * -30 

Ala Ala Leu Leu Ala Asp Ala Gly val Asp ser ser ser val Arg val 
-25 -20 -15 

■ 

Glu Glu Ala Glu Glu Ala Pro Gin val Tyr Ala Asp lie lie Gly Gly 
-10 -5 -11 5 

Leu Ala Tyr Tyr Met Gly Gly Arg Cys Ser val Gly Phe Ala Ala Thr 

10 15 20 

* 

Asn Ser Ala Gly Gin pro Gly Phe val Thr Ala Gly His Cys Gly Thr 
25 30 35 

> i 

i 
> 

Val Gly Thr Gly val Thr lie Gly Asn Gly Thr Gly Thr Phe Gin Asn 
40 45 50 

r- 

4 

Ser val Phe Pro Gly Asn Asp Ala Ala Phe Val Arg Gly Thr ser Asn 
55 60 65 70 a 

■ - 

Phe Thr Leu Thr Asn Leu val Ser Arg Tyr Asn ser Gly Gly Tyr Gin 

75 80 8^ 
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Ser val Thr Gly Thr Ser Gin Ala Pro Ala Gly Ser Ala val Cys Arg 

90 95 100 



Ser Gly Ser Thr Thr Gly Trp His Cys 
105 110 



Gin Thr val Arg Tyr Pro Gin Gly Thr 
120 125 



Gly Thr lie Gin Ala Arg Asn 

115 



val Tyr Ser Leu Thr Arg Thr 
130 



Asn val cys Ala Glu pro Gly Asp Ser Gly Gly Ser Phe lie Ser Gly 
135 140 145 150 

Ser Gin Ala Gin Gly val Thr Ser Gly Gly ser Gly Asn cys ser val 

155 160 165 

Gly Gly Thr Thr Tyr Tyr Gin Glu val Thr Pro Met lie Asn Ser Trp 

170 175 180 



Gly val Arg lie Arg Thr 
185 



<210> 5

<211> 1062

<212> DNA

<213> Nocardiopsis prasina DSM 15648 ("Protease 11")

<220>

<221> CDS

<222> (1)..(1059)

<220>

<221> mat_peptide
<222> (496)..(1059)

<400> 5

gcc acc gga ccg etc ccc cag tea ccc ace ccg gag gec gac gec 

Ala Thr Gly Pro Leu Pro Gin Ser Pro Thr Pro Glu Ala Asp Ala 

-165 -160 -155 

gtc tec atg cag gag gcg etc cag cgc gac etc ggc ctg acc ccg 
Val Ser Met Gin Glu Ala Leu Gin Arg Asp Leu Gly Leu Thr pro 
-150 -145 -140 

ctt gag gcc gat gaa ctg ctg gcc gcc cag gac acc gcc ttc gag 
Leu Glu Ala Asp Glu Leu Leu Ala Ala Gin Asp Thr Ala Phe Glu 
-135 -130 -125 

gtc gac gag gcc gcg gcc gcg gcc gcc ggg gac gcc tac ggc ggc 
val Asp Glu Ala Ala Ala Ala Ala Ala Gly Asp Ala Tyr Gly Gly 



-120 



-115 



-110 



tec gtc ttc gac acc gag acc ctg gaa ctg acc gtc ctg gtc acc gac 
ser val Phe Asp Thr Glu Thr Leu Glu Leu Thr val Leu val Thr Asp 
-105 -100 -95 -90 

gcc gcc teg gtc gag get gtg gag gcc acc ggc gcg ggt acc gaa etc 
Ala Ala ser val Glu Ala val Glu Ala Thr Gly Ala Gly Thr Glu Leu 

-85 -80 -75 
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90 



135 



180 



228 



276 



i 
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gtc tec tac ggc ate gag ggc etc gae gag ate ate 
val Ser Tvr gTv Tie Glu civ lpu asd gIu Tie lie 

-70 -65 


cag 
Gin 

VJ • » 1 


gat 

Acn 
-60 


etc 

Leu 

*m t W 


aac 


324 


9fc gee gac gee gtc ccc ggc gtg gtc ggc tgg tac 
Ala Ala asd Ala val Pro gTv VaT val gTv tpd Tvr 
-55 -50 


ccg 
Pro 
-45 


gac 

ACQ 


gtg 
VaT 


gcg 

Al a 


372 


got gac acc gtc gtc ctg gag gtc ctg- gag ggt tec 

Glv ASD Thr Val Val Leu Glu Val Leu Glti gTv Ser 
-40 -35 -30 


gga 
gTv 


gec 

Al a 


gac 

Acn 


gtg 

val 


420 


age ggc ctg etc gee gac gee ggc gtg gac gee teg 
Ser Gly teu Leu Ala asd Ala Glv val asd Ala ser 
-25 -20 -15 

■ 


gec 

Ala 


gtc 

Val 


gag 

Glu 


gtg 
val 
-10 


468 


acc age agt gcg cag ccc gag etc tac gee gac ate 

Thr Ser Ser Ala Gin Pro Glu Leu Tvr Ala asd lie 

-5 -1 1 


ate 

He 
Ait 


ggc 

gTv 
5 V 


ggt 

gT v 

vs i y 


ctg 

Leu 


516 


gec tac acc atg ggc ggc cgc tgt teg gtc gga ttc 

Ala Tvr Thr Met Giv Glv Aro rvs 5er Val Glv Phe 
10 15 


gcg 

Al a 

av i a 

20 


MIA 


acc 

Thr 
■ 1 1 1 


aac 

ACrt 
rO 1 1 


564 


gec gee ggt cag ccc gga ttc gtc acc gee ggt cac 

Ala Ala Glv Gin Pro Glv phe val Thr Ala Glv Hie 
i ** * « vj i y vj hi n u vj i y flic val iiii mio vj i y n I j 

25 7 30 35 


tgt 

cys 


ggc 

vj i y 


cgc 

A rn 
mi y 


gtg 

val 

Va l 


612 


ggc acc cag gtg age ate ggc aac ggc cag ggc gtc 

Glv Thr Gin val qpp tIo fiiv Acn n i \/ r:ln fiiu val 
m i y iiii vj i j ■ vai jci i ic v«y "in uiy v* i n vj iv vol 

40 45 50 


ttc 

KflC 


gag 

vj 1 U 


cag 

1^1 n 
Vj i n 


tec 

cor 
dci 

55 


660 


ate ttc ccg ggc aac gac gee gec ttc gtc cgc ggc 

XI & Phe Pro Giv Acn Acn Ala Ala Pho Val Am Glv 

60 65 


acg 

Thr 
i sir 


tec 

Co r 


aac 

Acn 

70 


ttc 


708 


acg ctg acc aac ctg gtc age cgc tac aac acc ggc 

Thr Leu Thr Asn Leu Val ser Aro Tvr -Asn Thr Glv 

75 80 


ggt 

gTv 


tac 

Tvr 


gee 

Ala 


acc 

Thr 
■ 1 1 1 


756 


gtc gee ggc cac aac cag gcg ccc ate ggc tec tec 
val Ala GTy His Asn Gin Ala Pro lie Glv ser ser 
90 95 


gtc 

Val 
100 


tgc 

Cvs 

v» jr «j 


cgc 

Ara 


tec 
ser 

«JV» • 


804 


ggc tec acc acc ggc tgg cac tgc ggc acc ate cag 
Glv Ser Thr Thr Glv Tro his Cvs Glv Thr lie Gin 

^* * j >*■• iiii vj w y ii kj n ■ w Vv jr j w i j ■ iiiv <*i^ vj ifi 

10S 110 115 


gec 

Ala 


cgc 

Aro 


ggc 

Glv 

\j i y 


cag 

g! n 

VJ III 


852 


teg gtg age tac ccc gag ggc acc gtc acc aac atg 
Ser vaT ser Tyr pro Glu Gly Thr val Thr Asn Met 
120 125 130 


acc 
Thr 


egg 
Arg 


acc 
Thr 


acc 
Thr 
135 


900 


gtg tgc gec gag ccc ggc gac tec ggc ggc tec tac 
val cys Ala Glu Pro cTy asd Ser GTy GTy Ser Tyr 

140 145 


ate 
He 


tec 
ser 


ggc 

GTy 
150 


aac 
Asn 


948 


cag gec cag ggc gtc acc tec ggc ggc tec ggc aac 
Gin Ala Gin GTy val Thr ser GTy GTy Ser GTy Asn 

155 160 


tgc 

Cys 


cgc 
Arg 
165 


acc 
Thr 


ggc 

Gly 


996 


ggg acc acc ttc tac cag gag gtc acc ccc atg gtg 
GTy Thr Thr Phe Tyr Gin Glu val Thr Pro Met vaT 
170 175 


aac 
Asn 
180 


tec 
Ser 


tgg 

Trp 


ggc 
Gly 


1044 



gtc cgt etc egg acc taa 1062 
val Arg Leu Arg Thr 
185 
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<210> 6 
<211> 353 
<212> PRT 

<213> Nocardiopsis prasina DSM 15648 ("Protease 11")
<400> 6 

Ala Thr Gly Pro Leu Pro Gln Ser Pro Thr Pro Glu Ala Asp Ala
-165 -160 -155

Val Ser Met Gln Glu Ala Leu Gln Arg Asp Leu Gly Leu Thr Pro
-150 -145 -140

Leu Glu Ala Asp Glu Leu Leu Ala Ala Gln Asp Thr Ala Phe Glu
-135 -130 -125

Val Asp Glu Ala Ala Ala Ala Ala Ala Gly Asp Ala Tyr Gly Gly
-120 -115 -110

Ser Val Phe Asp Thr Glu Thr Leu Glu Leu Thr Val Leu Val Thr Asp
-105 -100 -95 -90

Ala Ala Ser Val Glu Ala Val Glu Ala Thr Gly Ala Gly Thr Glu Leu
-85 -80 -75

Val Ser Tyr Gly Ile Glu Gly Leu Asp Glu Ile Ile Gln Asp Leu Asn
-70 -65 -60

Ala Ala Asp Ala Val Pro Gly Val Val Gly Trp Tyr Pro Asp Val Ala
-55 -50 -45

Gly Asp Thr Val Val Leu Glu Val Leu Glu Gly Ser Gly Ala Asp Val
-40 -35 -30

Ser Gly Leu Leu Ala Asp Ala Gly Val Asp Ala Ser Ala Val Glu Val
-25 -20 -15 -10

Thr Ser Ser Ala Gln Pro Glu Leu Tyr Ala Asp Ile Ile Gly Gly Leu
-5 -1 1 5

Ala Tyr Thr Met Gly Gly Arg Cys Ser Val Gly Phe Ala Ala Thr Asn
10 15 20

Ala Ala Gly Gln Pro Gly Phe Val Thr Ala Gly His Cys Gly Arg Val
25 30 35

Gly Thr Gln Val Ser Ile Gly Asn Gly Gln Gly Val Phe Glu Gln Ser
40 45 50 55

Ile Phe Pro Gly Asn Asp Ala Ala Phe Val Arg Gly Thr Ser Asn Phe
60 65 70
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10508-010-DK.ST25 

Thr Leu Thr Asn Leu Val Ser Arg Tyr Asn Thr Gly Gly Tyr Ala Thr
75 80 85

Val Ala Gly His Asn Gln Ala Pro Ile Gly Ser Ser Val Cys Arg Ser
90 95 100

Gly Ser Thr Thr Gly Trp His Cys Gly Thr Ile Gln Ala Arg Gly Gln
105 110 115

Ser Val Ser Tyr Pro Glu Gly Thr Val Thr Asn Met Thr Arg Thr Thr
120 125 130 135

Val Cys Ala Glu Pro Gly Asp Ser Gly Gly Ser Tyr Ile Ser Gly Asn
140 145 150

Gln Ala Gln Gly Val Thr Ser Gly Gly Ser Gly Asn Cys Arg Thr Gly
155 160 165

Gly Thr Thr Phe Tyr Gln Glu Val Thr Pro Met Val Asn Ser Trp Gly
170 175 180

Val Arg Leu Arg Thr
185

<210> 7 
<211> 1062 
<212> DNA 

<213> Nocardiopsis prasina DSM 15649 ("Protease 35")
<220>

<221> CDS

<222> (1)..(1059)

<220>

<221> mat_peptide
<222> (496)..(1059)

<400> 7 

|gcc acc gga cca etc ccc cag tea ccc acc ccg gag gec gac gec 45 

A] a Thr Gly pro Leu Pro Gin Ser Pro Thr pro Glu Ala Asp Ala 

-165 -160 -155 

gtc tec atg cag gag gcg etc cag cgc gac etc ggc ctg acc ccg 90 

val ser Met Gin Glu Ala Leu Gin Arg Asp Leu Gly Leu Thr Pro 

-150 -145 -140 

ctt gag gec gat gaa ctg ctg gec gec cag gac acc gee ttc gag 135 

Leu Glu Ala Asp Glu Leu Leu Ala Ala Gin Asp Thr Ala Phe Glu 

-135 -130 -125 

gtc gac gag gec gcg gee gag gee gec ggt gac gee tac ggc ggc 180 

val Asp Glu Ala Ala Ala Glu Ala Ala Gly Asp Ala Tyr Gly Gly 

-120 -115 -110 

tec gtc ttc gac acc gag acc ctg gaa ctg acc gtc ctg gtc acc gac 228 

ser val Phe Asp Thr Glu Thr Leu Glu Leu Thr val Leu val Thr Asp 

-105 -100 -95 -90 
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Glu Ala val Glu Ala Thr Giy Ala Giy 
-85 -80, 



gtc tec tac ggc ate acg ggc etc gac gag : atc gtc gag gag 
val ser Tyr Giy He Thr Giy Leu Asp Glu lie val Glu Glu 

-70 -65 -60 



-55 -50 -45 

ggt gac acc gtc gtg ctg gag gtc ctg gag ggt tec ggc gc 
Giy Asp Thr Val val Leu Glu val Leu Glu Giy ser Giy Al 
-40 -35 . -30 

ggc gac ctg etc gec gac gec ggc gtg gac gee teg gcg gtc gag gtg 
Giy Giy Leu Leu Ala Asp Ala Giy Val Asp Ala Ser Ala val Glu vaT 
-25 -2b -15 -10 

acc acc acc gag cag ccc gag ctg tac gec gac ate ate ggc 
Thr Thr Thr Glu Gin Pro Glu Leu Tyr Ala Asp lie lie Giy 

-5 -11 5 

gee tac acc atg ggc ggc cgc tgt teg gtc ggc ttc gcg gec acc aac 
Ala Tyr Thr Met Giy Giy Arg cys Ser val Giy Phe Ala Ala Thr Asn 
10 15 20 



Ala Giy Gin Pro Giy Phe Val Thr Ala Giy His 
25 * 30 35. 



Giy Thr Gin Val Thr rle Giy Asn Giy Arg Giy 
40 45 50 



60 65 



Asn Leu val Ser Arg Tyr Asn; Thr Giy gTv Tyi 
75 * 80 85 



90 95 100 



105 110 115 

teg gtg age tac ccc gag ggc acc gtc acc aac atg 
Ser val ser Tyr pro Glu Giy Thr val Thr Asn Met 
120 * 12S 130 



140 145 



Gin Ala Gin Giy val Thr ser Giy Giy ser Giy Asn Cys An 

155 160 16? 

ggg acc acc ttc tac cag gag gtc acc ccc atg gtg aac tc< 

Giy Thr Thr Phe Tyr Gin Glu val Thr Pro Met vaT Asn Sei 
170 175 180 




gaa 
Glu 

-75 


ctg 
Leu 


276 

t,\ 
.i 


etc 
Leu 


aac 
Asn 


324 

ij 


gtc 

val 


gcg 

Ala 


372 


gac 

ASp 


gtg 

vaT 


420 


gag 

Glu 


gtg 
vaT 

-10 


468 


got 
Giy 


ctg 
Leu 


516 


acc aac 
Thr Asn 


564 


cgc 
Arg 


gtg 

vaT 


'•612 

* 


cag 
Gin 


tec 
Ser 
55 


660 

■ 


aac 
Asn 
70 


ttc 
Phe 


!' 708 

% 
% 


gec 

Ala 


acc 
Thr 


'" 756 


cgc 
Arg 


tec 
Ser 


804 


ggc 

Giy 


cag 
Gin 


852 


acc acc 
Thr Thr 
135 


900 

• 


99C 
Giy 
150 


aac 
Asn 


948 

» 


acc 
Thr 


ggc 

Giy 


996 

»■• 

• 


tgg 
Trp 


ggc 
GTy 


• 
■ 

1044 




gtc cgt etc egg acc taa 1 
val Arg Leu Arg Thr 
185 

<210> 8

<211> 353 

<212> PRT 

<213> Nocardiopsis prasina DSM 15649 ("Protease 35") 

<400> 8 

Ala Thr Gly Pro Leu Pro Gln Ser Pro Thr Pro Glu Ala Asp Ala
-165 -160 -155

Val Ser Met Gln Glu Ala Leu Gln Arg Asp Leu Gly Leu Thr Pro
-150 -145 -140

Leu Glu Ala Asp Glu Leu Leu Ala Ala Gln Asp Thr Ala Phe Glu
-135 -130 -125

Val Asp Glu Ala Ala Ala Glu Ala Ala Gly Asp Ala Tyr Gly Gly
-120 -115 -110

Ser Val Phe Asp Thr Glu Thr Leu Glu Leu Thr Val Leu Val Thr Asp
-105 -100 -95 -90

Ser Ala Ala Val Glu Ala Val Glu Ala Thr Gly Ala Gly Thr Glu Leu
-85 -80 -75

Val Ser Tyr Gly Ile Thr Gly Leu Asp Glu Ile Val Glu Glu Leu Asn
-70 -65 -60

Ala Ala Asp Ala Val Pro Gly Val Val Gly Trp Tyr Pro Asp Val Ala
-55 -50 -45

Gly Asp Thr Val Val Leu Glu Val Leu Glu Gly Ser Gly Ala Asp Val
-40 -35 -30

Gly Gly Leu Leu Ala Asp Ala Gly Val Asp Ala Ser Ala Val Glu Val
-25 -20 -15 -10

Thr Thr Thr Glu Gln Pro Glu Leu Tyr Ala Asp Ile Ile Gly Gly Leu
-5 -1 1 5

Ala Tyr Thr Met Gly Gly Arg Cys Ser Val Gly Phe Ala Ala Thr Asn
10 15 20

Ala Ala Gly Gln Pro Gly Phe Val Thr Ala Gly His Cys Gly Arg Val
25 30 35

Gly Thr Gln Val Thr Ile Gly Asn Gly Arg Gly Val Phe Glu Gln Ser
40 45 50 55

Ile Phe Pro Gly Asn Asp Ala Ala Phe Val Arg Gly Thr Ser Asn Phe
60 65 70

Thr Leu Thr Asn Leu Val Ser Arg Tyr Asn Thr Gly Gly Tyr Ala Thr
75 80 85

Val Ala Gly His Asn Gln Ala Pro Ile Gly Ser Ser Val Cys Arg Ser
90 95 100

Gly Ser Thr Thr Gly Trp His Cys Gly Thr Ile Gln Ala Arg Gly Gln
105 110 115

Ser Val Ser Tyr Pro Glu Gly Thr Val Thr Asn Met Thr Arg Thr Thr
120 125 130 135

Val Cys Ala Glu Pro Gly Asp Ser Gly Gly Ser Tyr Ile Ser Gly Asn
140 145 150

Gln Ala Gln Gly Val Thr Ser Gly Gly Ser Gly Asn Cys Arg Thr Gly
155 160 165

Gly Thr Thr Phe Tyr Gln Glu Val Thr Pro Met Val Asn Ser Trp Gly
170 175 180

Val Arg Leu Arg Thr
185

<210> 9 
<211> 1068 

<212> DNA

<213> Nocardiopsis alba DSM 15647 ("Protease 08")
<220>

<221> CDS

<222> (1)..(1065)

<220>

<221> mat_peptide
<222> (502)..(1065)

<400> 9 

gcg acc ggc ccc etc ccc cag tec ccc acc ccg gat gaa gec gag 
Ala Thr Gly Pro Leu Pro Gin Ser Pro Thr Pro Asp Glu Ala Glu 
-165 -160 -155 

gec acc acc atg gtc gag gec etc cag cgc gac etc ggc ctg tec 90 
Ala Thr Thr Met val Glu Ala Leu Gin Arg Asp Leu Gly Leu Ser 
-150 -145 -140 

ccc tct cag gec gac gag etc etc gag gcg cag gec gag tec ttc 135 
Pro Ser Gin Ala Asp Glu Leu Leu Glu Ala Gin Ala Glu Ser Phe 



45 



180 



-135 -130 -125 

gag ate gac gag gec gee acc gcg gec gca gee gac tec tac ggc 
Glu He Asp Glu Ala Ala Thr Ala Ala Ala Ala Asp Ser Tyr Gly 
-120 -115 -110 

ggc tec ate ttc gac acc gac age etc acc ctg acc gtc ctg gtc acc 228 
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Gly Ser He Phe Asp Thr Asp Ser Leu Thr Leu Thr val Leu val Thr 
-105 -100 -95 

gac gcc tec gcc gtc gag gcg gtc gag gcc gcc ggc gcc gag gcc aag 
asp Ala Ser Ala val Glu Ala val Glu Ala Ala Giy Ala Glu Ala Lys 
-90 -65 -80 

gtg gtc teg cac ggc atg gag ggc ctg gag gag ate gtc gee gac ctg 
val val Ser His Gly Met Glu Gly Leu Glu Glu lie Val Ala Asp Leu 
-75 -70 -65 -60 

aac gcg gcc gac get cag ccc ggc gtc gtg ggc tgg tac ccc gac ate 

Asn Ala Ala Asp Ala Gin Pro Gly val val Giy Trp Tyr Pro Asp He 

-55 -50 -45 

i 

cac tee gac acg gtc gtc etc gag gtc etc gag ggc tec got gcc gac 

His ser Asp Thr val val Leu Glu val Leu Glu Giy ser Gly Ala Asp 

-40 -35 -30 





gat etc gcc tac acc atg ggt ggg cgc tgc teg gtc ggc ttc gcg gcc 
Gly Leu Ala Tyr Thr Met Gly Gly Arg cys ser val Gly Phe Ala Ala 

10 15 20 

acc aac gcc tec ggc cag ccc ggg ttc gtc acc gcc ggc cac tgc ggc 
Thr Asn Ala ser Gly Gin Pro Giy Phe Val Thr Ala Gly His Cys Gly 

25 30 35 

acc gtc ggc acc ceg gtc age ate ggc aac ggc cag ggc gtc ttc gag 
Thr val Gly Thr Pro val ser lie Gly Asn Gly Gin Gly val phe Glu 
40 45 50 

cgt tec gtc ttc ccc ggc aac §ac tec gee ttc gtc cgc ggc ace teg 
Arg Ser val Phe Pro Gly Asn Asp Ser Ala Phe val Arg Gly Thr ser 
55 60 65 

aac ttc acc ctg acc aac ctg gtc age cgc tac aac acc ggt ggt tac 
Asn Phe Thr Leu Thr Asn Leu val ser Arg Tyr Asn Thr Gly Gly Tyr 
70 75 80 85 

« 

gcg acc gtc tee ggc tec teg cag gcg gcg ate ggc teg cag ate tgc 

Ala Thr Val Ser Gly Ser Ser Gin Ala Ala lie Gly ser Gin lie Cys 

90 95 100 

cgt tec ggc tec acc acc ggc tgg cac tgc ggc acc gtc cag gcc cgc 
Arg Ser Gly ser Thr Thr Gly Trp His Cys Gly Thr val Gin Ala Arg 

105 110 115 

ggc cag acg gtg age tac ccc cag ggc acc gtg cag aac ctg acc cgc 
Gly Gin Thr val Ser Tyr Pro Gin Gly Thr val Gin Asn Leu Thr Arg 
120 125 * 130 

acc aac gtc tgc gcc gag ccc ggt gac tec ggc ggc tec ttc ate tec 
Thr Asn Val cys Ala Glu Pro Gly Asp Ser Gly Giy ser Phe lie Ser 
135 140 145 

ggc age cag gcc cag ggc gtc acc tee ggt ggc tec ggc aac tgc tec 
Gly Ser Gin Ala Gin Gly val Thr ser Gly Gly ser Gly Asn Cys ser 
150 155 160 165 




276 



324 



372 



420 



gtg gac tec ctg etc gcc gac gcc ggt gtg gac acc gcc gac gtc aag 468 
Val Asp Ser Leu Leu Ala Asp Ala Gly val Asp Thr Ala Asp val Lys 
^25 -20 -15 

gtg gag age acc acc gag cag ccc gag ctg tac gcc gac ate ate ggc 516 
Val Glu Ser Thr Thr Glu Gin Pro Glu Leu Tyr Ala Asp lie lie Gly 
-10 -5 -11 5 



564 



612 



660 



708 



756 



804 
8S2 



900 



948 



996 



ttc ggt ggc acc acc tac tac cag gag gtc aac ceg atg ctg age age 1044 
Phe Gly Gly Thr Thr Tyr Tyr Gin Glu Val Asn Pro Met Leu Ser Ser 

170 175 180 

*99 ggt ctg acc ctg cgc acc tga 
Trp Gly Leu Thr Leu Arg Thr 

185 

<210> 10 
<211> 355 
<212> PRT 

<213> Nocardiopsis alba DSM 15647 ("Protease 08")
<400> 10 

Ala Thr Gly Pro Leu Pro Gln Ser Pro Thr Pro Asp Glu Ala Glu
-165 -160 -155

Ala Thr Thr Met Val Glu Ala Leu Gln Arg Asp Leu Gly Leu Ser
-150 -145 -140

Pro Ser Gln Ala Asp Glu Leu Leu Glu Ala Gln Ala Glu Ser Phe
-135 -130 -125

Glu Ile Asp Glu Ala Ala Thr Ala Ala Ala Ala Asp Ser Tyr Gly
-120 -115 -110

Gly Ser Ile Phe Asp Thr Asp Ser Leu Thr Leu Thr Val Leu Val Thr
-105 -100 -95

Asp Ala Ser Ala Val Glu Ala Val Glu Ala Ala Gly Ala Glu Ala Lys
-90 -85 -80

Val Val Ser His Gly Met Glu Gly Leu Glu Glu Ile Val Ala Asp Leu
-75 -70 -65 -60

Asn Ala Ala Asp Ala Gln Pro Gly Val Val Gly Trp Tyr Pro Asp Ile
-55 -50 -45

His Ser Asp Thr Val Val Leu Glu Val Leu Glu Gly Ser Gly Ala Asp
-40 -35 -30

Val Asp Ser Leu Leu Ala Asp Ala Gly Val Asp Thr Ala Asp Val Lys
-25 -20 -15

Val Glu Ser Thr Thr Glu Gln Pro Glu Leu Tyr Ala Asp Ile Ile Gly
-10 -5 -1 1 5

Gly Leu Ala Tyr Thr Met Gly Gly Arg Cys Ser Val Gly Phe Ala Ala
10 15 20

Thr Asn Ala Ser Gly Gln Pro Gly Phe Val Thr Ala Gly His Cys Gly
25 30 35

Thr Val Gly Thr Pro Val Ser Ile Gly Asn Gly Gln Gly Val Phe Glu
40 45 50

Arg Ser Val Phe Pro Gly Asn Asp Ser Ala Phe Val Arg Gly Thr Ser
55 60 65

Asn Phe Thr Leu Thr Asn Leu Val Ser Arg Tyr Asn Thr Gly Gly Tyr
70 75 80 85

Ala Thr Val Ser Gly Ser Ser Gln Ala Ala Ile Gly Ser Gln Ile Cys
90 95 100

Arg Ser Gly Ser Thr Thr Gly Trp His Cys Gly Thr Val Gln Ala Arg
105 110 115

Gly Gln Thr Val Ser Tyr Pro Gln Gly Thr Val Gln Asn Leu Thr Arg
120 125 130

Thr Asn Val Cys Ala Glu Pro Gly Asp Ser Gly Gly Ser Phe Ile Ser
135 140 145

Gly Ser Gln Ala Gln Gly Val Thr Ser Gly Gly Ser Gly Asn Cys Ser
150 155 160 165

Phe Gly Gly Thr Thr Tyr Tyr Gln Glu Val Asn Pro Met Leu Ser Ser
170 175 180

Trp Gly Leu Thr Leu Arg Thr
185

<210> 11 
<211> 1164 
<212> DNA

<213> synthetic ("Protease 22")
<220>

<221> CDS

<222> (1)..(1164)

<220>

<221> sig_peptide
<222> (1)..(81)

<220>

<221> misc_feature

<222> (82).. (576) 

<223> Propeptide 

<220> 

<221> mat_peptide 
<222> (577).. (1164) 

<400> 11 

atg aaa aaa ccg ctg gga aaa att gtc gca age aca gca ctt ctt 

Met Lys Lys pro Leu Gly Lys lie val Ala ser Thr Ala Leu Leu 
-190 -185 -180 

att tea gtg gca ttt age tea tct att gca tea gca get aca gga 
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He ser val Ala Phe Ser ser Ser lie Ala Ser Ala Ala Thr Giy 
-175 -170 -165 

gca tta ccg cag tct ccg aca ccg gaa gca gat gca gtc tea atg 
Ala Leu Pro Gin Ser Pro Thr pro Glu Ala Asp Ala val Ser Met 
-160 -155 -150 

caa gaa gca ctg caa aga gat ctt gat ctt aca tea gca gaa gca 
Gin Glu Ala Leu Gin Arg Asp Leu Asp Leu Thr Ser Ala Glu Ala 
-145 -140 -135 



gaa gaa ctt ctt get gca' caa gat aca gca ttt gaa gtg gat gaa 
Glu Glu Leu Leu Ala Ala Gin Asp Thr Ala Phe Glu val Asp Glu 
•130 -125 -120 



gca gcg gca gaa gca gca gga gat gca tat ggc ggc tea gtt ttt 
Ala Ala Ala Glu Ala Ala Giy Asp Ala Tyr Giy cry Ser val Phe 
-115 -110 -105 

gat aca gaa tea ctt gaa ctt aca gtt ctt gtt aca gat gca gca gca 
Asp Thr Glu Ser Leu Glu Leu Thr val Leu Val Thr Asp Ala Ala Ala 
-100 -95 -90 

gtt gaa gca gtt gaa gca aca gga gca gga aca gta ctt gtt tea tat 
val Glu Ala val Glu Ala Thr Giy Ala Giy Thr val Leu val Ser Tyr 
•85 -80 -75 

gga att gat ggc ctt gat gaa att gtt caa gaa ctg aat gca get gat 
Giy lie Asp Giy Leu asp Glu He val Gin Glu Leu Asn Ala Ala Asp 
-70 -65 ; -60 -55 

get gtt ccg ggc gtt gtt ggc tgg tat ccg gat gtt get gga gat aca 
Ala val pro Giy val val Giy Trp Tyr Pro Asp Val Ala Giy Asp Thr . 

-50 -45 -40 

gtt gtc ctt gaa gtt ctt gaa gga tea ggc gca gat gtt tea ggc ctg 
val val Leu Glu val Leu Glu Giy ser Giy Ala Asp val ser Giy Leu 

-35 *~30 i -25 

ctg gca gac gca gga gtc gat gca tea gca gtt gaa gtt aca aca tea 
Leu Ala asd Ala Giy val Asp Ala ser Ala val Glu val Thr Thr ser 
-20 -is -10 

gat caa ccg gaa ctt tat gca gat att att ggc ggc ctg gca tat tat 
Asp Gin pro Glu Leu Tyr Ala Asp lie He Giy Giy Leu Ala Tyr Tyr 
-5 -11 s io 

atg gac ggc aga tgc age gtt ggc ttt gca gea aca aat gca tea ggc 
Met Giy Giy Arg Cys Ser Val Giy Phe Ala Ala Thr Asn Ala Ser Giy 

15 20 25 

caa ccg ggc ttt gtt aca gca ggc cat tgc ggc aca gtt ggc aca cca 
Gin Pro Giy Phe Val Thr Ala Giy His Cys Giy Thr Val Giy Thr Pro 

30 35 40 

gtt tea att ggc aat ggc aaa ggc gtt ttt gaa cga age att ttt ccg 
val Ser lie Giy Asn Giy Lys Giy val Phe Glu Arg Ser lie Phe Pro 
45 50 ! * 55 

ggc aat gat tea gca ttt gtt aga ggc aca tea aat ttt aca ctt aca 
Giy Asn Asp Ser Ala Phe val Arg Giy Thr Ser Asn Phe Thr Leu Thr 
60 65 I 70 

t 

aat ctg gtt tea aga tat aat tea ggc ggc tat gca aca gtt gca ggc 
Asn Leu val Ser Arg Tyr Asn Ser Giy Giy Tyr Ala Thr val Ala Giy 
75 80 : 85 90 

j 

cat aat caa gca ccg att ggc tea gca gtt tgc aga tea ggc tea aca 
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Ala teu pro Gin Ser Pro Thr Pro Glu Ala Asp Ala val Ser Met 
-160 -155 -150 

9 

Gin Glu Ala Leu Gin Arg Asp Leu Asp LeuThr ser Ala Glu Ala 
-145 -140 -135 

Glu Glu Leu Leu Ala Ala Gin Asp Thr Ala Phe Glu val Asp Glu 
-130 -125 -120 

Ala Ala Ala Glu Ala Ala Gly Asp Ala Tyr Gly Gly ser val Phe 
-115 -110 -105 

Asp Thr Glu ser Leu Glu Leu Thr val Leu val Thr Asp Ala Ala Ala 
-100 -95 -90 

* 

val Glu Ala val Glu Ala Thr Gly Ala Gly Thr val Leu val ser Tyr 
-85 -80 -75 

■ 

Gly lie Asp Gly Leu Asp Glu lie val Gin Glu Leu Asn Ala Ala Asp 
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ms Asn Gin Ala Pro lie Gly Ser Ala val Cys Arg Ser Gly ser Thr 

95 100 105 

aca ggc tgg cat tgc ggc aca att caa gca aga aat caa aca gtt agg 942 
Thr Gly Trp His Cys Gly Thr lie Gin Ala Arg Asn Gin Thr val Arg 

110 115 120 

tat ccg caa ggc aca gtt tat agt ctg aca aga aca aca gtt tgt gca 
Tyr Pro Gin Gly Thr val Tyr Ser Leu Thr Arg Thr Thr val Cys Ala 
125 130 135 

gaa ccg ggc gat tea ggc ggc tea tat att age ggc act caa gca caa 
Glu pro Gly Asp Ser Gly Gly Ser Tyr lie ser Gly Thr Gin Ala Gin 
140 K 7 14? 150 

ggc gtt aca tea ggc ggc tea ggc aat tgc agt get ggc ggc aca aca 
Gly val Thr ser Gly Gly Ser Gly Asn cys ser Ala Gly Gly Thr Thr 
155 ' 160 165 170 
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tat tac caa gaa gtt aat ccg atg ctt agt tea tgg ggc ctt aca ctt 1134 
Tyr Tyr Gin Glu val Asn pro Met Leu ser ser Trp G/y Leu Thr Leu 

175 180 185 

aga aca caa teg cat gtt caa tec get cca 1164 
Arg Thr Gin ser His val Gin ser Ala pro 

190 195 

<210> 12 
<211> 388 
<212> PRT 

<213> synthetic ("Protease 22") 
<400> 12 

Met Lys Lys Pro Leu Gly Lys lie val Ala Ser Thr Ala Leu Leu 

-190 -185 . -180 it, 

lie ser val Ala Phe Ser ser ser He Ala Ser Ala Ala Thr Gly 

-175 -170 -165 i: 



It. 
-70 -65 -60 -55

Ala Val Pro Gly Val Val Gly Trp Tyr Pro Asp Val Ala Gly Asp Thr
-50 -45 -40

Val Val Leu Glu Val Leu Glu Gly Ser Gly Ala Asp Val Ser Gly Leu
-35 -30 -25

Leu Ala Asp Ala Gly Val Asp Ala Ser Ala Val Glu Val Thr Thr Ser
-20 -15 -10

Asp Gln Pro Glu Leu Tyr Ala Asp Ile Ile Gly Gly Leu Ala Tyr Tyr
-5 -1 1 5 10

Met Gly Gly Arg Cys Ser Val Gly Phe Ala Ala Thr Asn Ala Ser Gly
15 20 25

Gln Pro Gly Phe Val Thr Ala Gly His Cys Gly Thr Val Gly Thr Pro
30 35 40

Val Ser Ile Gly Asn Gly Lys Gly Val Phe Glu Arg Ser Ile Phe Pro
45 50 55

Gly Asn Asp Ser Ala Phe Val Arg Gly Thr Ser Asn Phe Thr Leu Thr
60 65 70

Asn Leu Val Ser Arg Tyr Asn Ser Gly Gly Tyr Ala Thr Val Ala Gly
75 80 85 90

His Asn Gln Ala Pro Ile Gly Ser Ala Val Cys Arg Ser Gly Ser Thr
95 100 105

Thr Gly Trp His Cys Gly Thr Ile Gln Ala Arg Asn Gln Thr Val Arg
110 115 120

Tyr Pro Gln Gly Thr Val Tyr Ser Leu Thr Arg Thr Thr Val Cys Ala
125 130 135

Glu Pro Gly Asp Ser Gly Gly Ser Tyr Ile Ser Gly Thr Gln Ala Gln
140 145 150

Gly Val Thr Ser Gly Gly Ser Gly Asn Cys Ser Ala Gly Gly Thr Thr
155 160 165 170

Tyr Tyr Gln Glu Val Asn Pro Met Leu Ser Ser Trp Gly Leu Thr Leu
175 180 185

Arg Thr Gln Ser His Val Gln Ser Ala Pro
190 195
PROTEASE VARIANTS

Field of the Invention 

The present invention relates to variants of a parent protease, in particular variants of
amended properties, such as improved thermostability. The invention also relates to DNA
sequences encoding such variants, their production in a recombinant host cell, as well as
methods of using the variants, in particular within the field of animal feed and detergents. The
invention furthermore relates to methods of generating and preparing protease variants of
amended properties. Preferred parent proteases are the Nocardiopsis proteases comprising
SEQ ID NOs: 2, 4, 6, 8 and 10.

Background of the Invention 

Proteases derived from strains of Nocardiopsis are disclosed in WO 88/03947, WO
01/58276, and DK 1996 00013 ("Protease 10," SEQ ID NOs: 1-2); DK 2003 00912 ("Protease
08," SEQ ID NOs: 9-10); DK 2003 00913 ("Protease 11," SEQ ID NOs: 5-6); DK 2003 00914
("Protease 18," SEQ ID NOs: 3-4); DK 2003 00915 ("Protease 35," SEQ ID NOs: 7-8), DD
2004328, and in JP 02255081.

It is an object of the present invention to provide novel and improved protease variants,
in particular of amended properties, such as improved thermostability.

Summary of the Invention 

The present invention relates to a variant of a parent protease, comprising a
substitution in at least one position of at least one region selected from the group of regions
consisting of: 6-18; 22-28; 32-39; 42-58; 62-63; 66-76; 78-100; 103-106; 111-114; 118-131;
134-136; 139-141; 144-151; 155-156; 160-176; 179-181; and 184-188; wherein

(a) the variant has protease activity; and 

(b) each position corresponds to a position of SEQ ID NO: 2; and 

(c) the variant has a percentage of identity to SEQ ID NO: 2 of at least 60%. 

The present invention also relates to isolated nucleic acid sequences encoding the
protease variant and to nucleic acid constructs, vectors, and host cells comprising the nucleic
acid sequences as well as methods for producing and using the protease variants.

Brief Description of the Figures 

Figure 1 is a multiple alignment of Protease 10, Protease 18, Protease 11, Protease 35
and Protease 08 (the mature peptide parts of SEQ ID NOs: 2, 4, 6, 8 and 10, respectively);
and

Figure 2 provides the coordinates of the novel 3D structure of Protease 10 (SEQ ID
NO: 2) derived from Nocardiopsis sp. NRRL 18262.

Detailed Description of the Invention 
Three-dimensional Structure of Protease 10

The structure of Protease 10 was solved in accordance with the principles for X-ray
crystallographic methods as given, for example, in X-Ray Structure Determination, Stout, G.H.
and Jensen, L.H., John Wiley & Sons, Inc., NY, 1989. The structural coordinates for the crystal
structure at 2.2 Å resolution using the isomorphous replacement method are given in Fig. 2 in
standard PDB format (Protein Data Bank, Brookhaven National Laboratory, Brookhaven, CT).
The PDB file of Fig. 2 relates to the mature peptide part of Protease 10 corresponding to
residues 1-188 of SEQ ID NO: 2.

Molecular Dynamics (MD)

Molecular Dynamics (MD) simulations are indicative of the mobility of the amino acids
in a protein structure (see McCammon, JA and Harvey, SC., (1987), "Dynamics of proteins
and nucleic acids", Cambridge University Press). Such protein dynamics are often compared
to the crystallographic B-factors (see Stout, GH and Jensen, LH, (1989), "X-ray structure
determination", Wiley). By running the MD simulation at, e.g., different temperatures, the
temperature related mobility of residues is simulated. Regions having the highest mobility or
flexibility (here isotropic fluctuations) may be suggested for random mutagenesis. It is here
understood that the high mobility found in certain areas of the protein may be thermally
improved by substituting these residues.

Using the programs CHARMM (Accelrys) and NAMD (University of Illinois at Urbana-
Champaign), the Protease 10 structure described above was subjected to MD at 300 and
400 K. Starting from the coordinates of Figure 2, hydrogen and missing heavy atoms were built
using the CHARMM procedures HBUILD and IC BUILD, respectively. The structure was then
minimized using the CHARMM Conjugate Gradients (CONJ) minimization procedure for a total of
200 steps. The protein was then placed in a 70 x 70 x 70 Angstrom box and solvated with TIP3
water molecules. A total of 11124 water molecules were added and then minimized, keeping
the protein coordinates fixed, using the CHARMM Adopted Basis Newton Raphson (ABNR)
minimization procedure for 20000 steps. The system was then heated to the desired
temperature at a rate of 1 K every 100 steps using the NAMD software. After an equilibration of
50 picoseconds, an NVE ensemble MD was run for 1 nanosecond, both steps done with the
software NAMD. A cut-off of 12 Angstrom was used for the non-bonded interactions. Periodic
boundary conditions were used after the solvation step and for all the subsequent ones. The
isotropic root mean square (RMS) fluctuations were calculated with the CHARMM procedure
COORDYNA.
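The isotropic RMS fluctuation of an atom is the square root of the mean squared distance of that atom from its own average position over the trajectory. The following Python sketch illustrates this quantity only; it is not the CHARMM COORDYNA implementation. The function name and the input layout (a list of snapshots, each a list of x, y, z tuples) are assumptions made for illustration, and the snapshots are assumed to be already superimposed on a common reference frame.

```python
import math

def isotropic_rms_fluctuations(frames):
    """Return one isotropic RMS fluctuation per atom.

    frames: list of snapshots; each snapshot is a list of (x, y, z)
    tuples, one per atom, in the same atom order (hypothetical layout).
    The result has the same length unit as the input coordinates.
    """
    n_frames = len(frames)
    n_atoms = len(frames[0])
    fluctuations = []
    for atom in range(n_atoms):
        # Average position of this atom over all snapshots.
        mean = [sum(f[atom][k] for f in frames) / n_frames for k in range(3)]
        # Mean squared displacement from the average position.
        msd = sum(
            sum((f[atom][k] - mean[k]) ** 2 for k in range(3))
            for f in frames
        ) / n_frames
        fluctuations.append(math.sqrt(msd))
    return fluctuations
```

Averaging such per-atom values over the atoms of each residue would give a per-residue mobility profile of the kind used to rank regions; how the values were aggregated in the actual work is not stated in the text.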

The following suggested regions for mutagenesis result from MD simulations: From
residue 160 to 170, from residue 78 to 90, from residue 43 to 50, from residue 66 to 75, and
from residue 22 to 28.

Strategy for Preparing Variants

Regions of amino acid residues, as well as individual amino acid substitutions, were
suggested for mutagenesis based on the 3D-structure of Fig. 2 and the alignment of Fig. 1,
mainly with a view to improving thermostability.

The following regions were suggested, cf. claim 1: 6-18; 22-28; 32-39; 42-58; 62-63;
66-76; 78-100; 103-106; 111-114; 118-131; 134-136; 139-141; 144-151; 155-156; 160-176;
179-181; and 184-188.

At least one of the following positions of the above regions is preferably subjected to
mutagenesis, cf. claim 3: 6; 7; 8; 9; 10; 12; 13; 16; 17; 18; 22; 23; 24; 25; 26; 27; 28; 32; 33;
37; 38; 39; 42; 43; 44; 45; 46; 47; 48; 49; 50; 51; 52; 53; 54; 55; 56; 58; 62; 63; 66; 67; 68; 69;
70; 71; 72; 73; 74; 75; 76; 78; 79; 80; 81; 82; 83; 84; 85; 86; 87; 88; 89; 90; 91; 92; 93; 94; 95;
96; 97; 98; 99; 100; 103; 105; 106; 111; 113; 114; 118; 120; 122; 124; 125; 127; 129; 130;
131; 134; 135; 136; 139; 140; 141; 144; 145; 146; 147; 148; 149; 150; 151; 155; 156; 160;
161; 162; 163; 164; 165; 166; 167; 168; 169; 170; 171; 172; 173; 174; 175; 176; 179; 180;
181; 184; 185; 186; 187; and/or 188.
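Whether a given substitution position (numbered according to the mature part of SEQ ID NO: 2) falls within one of the seventeen regions listed above can be checked mechanically. The following Python sketch is offered as an illustration only; the names REGIONS, region_of and regions_hit are hypothetical, and the interval boundaries are simply transcribed from the list of regions in the text.

```python
# Regions of SEQ ID NO: 2 (mature numbering), transcribed from the text.
REGIONS = [
    (6, 18), (22, 28), (32, 39), (42, 58), (62, 63), (66, 76),
    (78, 100), (103, 106), (111, 114), (118, 131), (134, 136),
    (139, 141), (144, 151), (155, 156), (160, 176), (179, 181),
    (184, 188),
]

def region_of(position):
    """Return the (start, end) region containing position, or None."""
    for start, end in REGIONS:
        if start <= position <= end:
            return (start, end)
    return None

def regions_hit(positions):
    """Return the sorted list of distinct regions touched by the positions."""
    return sorted({r for r in map(region_of, positions) if r is not None})

# Example: positions 10 and 12 share one region, 45 lies in another,
# and 101 lies outside all listed regions.
example = regions_hit([10, 12, 45, 101])
```

The length of the list returned by regions_hit corresponds to the number of regions subjected to mutagenesis for a given set of substituted positions.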

Contemplated specific variants are listed in the claims, viz. variants of Protease 10,
Protease 18, Protease 11, Protease 35 as well as Protease 08 in claims 4 and 15; variants of
Protease 10 in claim 16; variants of Protease 18 in claim 17; variants of Protease 11 in claim
18; variants of Protease 35 in claim 19; and variants of Protease 08 in claim 20. Subgroups of
specific variants are listed in claims 21-23.

The various concepts underlying the invention are also reflected in the claims as
follows: Stabilization by disulfide-bridges in claims 5 and 6; proline-stabilization in claims 7-8;
substitution of exposed neutral residues with negatively charged residues in claims 9-10;
substitution of exposed neutral residues with positively charged residues in claims 11-12;
substitution of small residues with bulkier residues inside the protein in claim 13; and regions
proposed for mutagenesis following MD simulations in claim 14.

The term "at least one" means "one or more," viz., e.g. in the context of regions: One,
two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, thirteen, fourteen, fifteen,
sixteen, or seventeen; or, in the context of positions or substitutions: One, two, three, four,
five, and so on, up to e.g. ninety.

In a particular embodiment, the number of regions proposed for and/or subjected to
mutagenesis is at least one, two, three, four, five, six, seven, eight, nine, ten, eleven, twelve,
thirteen, fourteen, fifteen, sixteen, or at least seventeen.

In another particular embodiment, the number of regions proposed for and/or subjected
to mutagenesis is no more than one, two, three, four, five, six, seven, eight, nine, ten, eleven,
twelve, thirteen, fourteen, fifteen, sixteen, or no more than seventeen.

Polypeptides Having Protease Activity

Polypeptides having protease activity, or proteases, are sometimes also designated
peptidases, proteinases, peptide hydrolases, or proteolytic enzymes. Proteases may be of the
exo-type that hydrolyse peptides starting at either end thereof, or of the endo-type that act
internally in polypeptide chains (endopeptidases). Endopeptidases show activity on N- and C-
terminally blocked peptide substrates that are relevant for the specificity of the protease in
question.

The term "protease" is defined herein as an enzyme that hydrolyses peptide bonds.
This definition of protease also applies to the protease-part of the terms "parent protease" and
"protease variant," as used herein. The term "protease" includes any enzyme belonging to the
EC 3.4 enzyme group (including each of the thirteen subclasses thereof). The EC number
refers to Enzyme Nomenclature 1992 from NC-IUBMB, Academic Press, San Diego,
California, including supplements 1-5 published in Eur. J. Biochem. 1994, 223, 1-5; Eur. J.
Biochem. 1995, 232, 1-6; Eur. J. Biochem. 1996, 237, 1-5; Eur. J. Biochem. 1997, 250, 1-6;
and Eur. J. Biochem. 1999, 264, 610-650; respectively. The nomenclature is regularly
supplemented and updated; see e.g. the World Wide Web (WWW) at
http://www.chem.qmw.ac.uk/iubmb/enzyme/index.html.

Proteases are classified on the basis of their catalytic mechanism into the following
groups: Serine proteases (S), Cysteine proteases (C), Aspartic proteases (A), Metallo
proteases (M), and Unknown, or as yet unclassified, proteases (U), see Handbook of Proteolytic
Enzymes, A.J.Barrett, N.D.Rawlings, J.F.Woessner (eds), Academic Press (1998), in
particular the general introduction part.

In particular embodiments, the parent proteases and/or the protease variants of the
invention and for use according to the invention are selected from the group consisting of:

(a) Proteases belonging to the EC 3.4.-.- enzyme group;

(b) Serine proteases belonging to the S group of the above Handbook;

(c1) Serine proteases of peptidase family S2A; and

(c2) Serine proteases of peptidase family S1E as described in Biochem. J. 290:205-
218 (1993) and in MEROPS protease database, release 6.20, March 24, 2003,
(www.merops.ac.uk). The database is described in Rawlings, N.D., O'Brien, E. A. & Barrett,
A. J. (2002) MEROPS: the protease database. Nucleic Acids Res. 30, 343-346.

For determining whether a given protease is a Serine protease, and a family S2A
protease, reference is made to the above Handbook and the principles indicated therein. Such
determination can be carried out for all types of proteases, be it naturally occurring or wild-type
proteases; or genetically engineered or synthetic proteases.

Protease activity can be measured using any assay in which a substrate is employed
that includes peptide bonds relevant for the specificity of the protease in question. Assay-pH
and assay-temperature are likewise to be adapted to the protease in question. Examples of
assay-pH-values are pH 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12. Examples of assay-temperatures
are 30, 35, 37, 40, 45, 50, 55, 60, 65, 70, 80, 90, or 95°C. Examples of protease substrates
are casein, such as Azurine-Crosslinked Casein (AZCL-casein). Examples of suitable
protease assays are described in Example 1.

Parent Protease 

The parent protease is a protease from which the protease variant is, or can be,
derived. For the present purposes, any protease can be used as the parent protease, as long
as the resulting protease variant is homologous to Protease 10, i.e. the protease derived from
Nocardiopsis sp. NRRL 18262 and comprising amino acids 1-188 of SEQ ID NO: 2.

In a particular embodiment the parent protease is also homologous to Protease 10.

In the present context, homologous means having an identity of at least 60% to SEQ
ID NO: 2, viz. amino acids 1-188 of the mature peptide part of Protease 10. Homology is
determined as generally described below in the section entitled Amino Acid Homology.

The parent protease may be a wild-type or naturally occurring polypeptide, or an allelic
variant thereof, or a fragment thereof that has protease activity, in particular a mature part
thereof. It may also be a variant thereof and/or a genetically engineered or synthetic
polypeptide.

In a particular embodiment the wild-type parent protease is i) a bacterial protease; ii) a
protease of the phylum Actinobacteria; iii) of the class Actinobacteria; iv) of the order
Actinomycetales; v) of the family Nocardiopsaceae; vi) of the genus Nocardiopsis; and/or a
protease derived from vii) Nocardiopsis species, such as Nocardiopsis alba, Nocardiopsis
antarctica, Nocardiopsis composta, Nocardiopsis dassonvillei, Nocardiopsis exhalans,
Nocardiopsis halophila, Nocardiopsis halotolerans, Nocardiopsis kunsanensis, Nocardiopsis
listeri, Nocardiopsis lucentensis, Nocardiopsis metallicus, Nocardiopsis prasina, Nocardiopsis
sp., Nocardiopsis synnemataformans, Nocardiopsis trehalosi, Nocardiopsis tropica,
Nocardiopsis umidischolae, or Nocardiopsis xinjiangensis.

Examples of such strains are: Nocardiopsis alba DSM 15647 (wild-type producer of
Protease 08), Nocardiopsis dassonvillei NRRL 18133 (wild-type producer of Protease M58-1
described in WO 88/03947), Nocardiopsis dassonvillei subsp. dassonvillei DSM 43235 (wild-type
producer of Protease 18), Nocardiopsis prasina DSM 15648 (wild-type producer of
Protease 11), Nocardiopsis prasina DSM 15649 (wild-type producer of Protease 35),
Nocardiopsis sp. NRRL 18262 (wild-type producer of Protease 10), Nocardiopsis sp. strain
FERM P-10508 (described in JP 02255081), or Nocardiopsis dassonvillei strain ZIMET 43647
(described in DD 2,004,328).
Strains of these species are accessible to the public in a number of culture collections,
such as the American Type Culture Collection (ATCC), Deutsche Sammlung von
Mikroorganismen und Zellkulturen GmbH (DSMZ), Centraalbureau voor Schimmelcultures
(CBS), and Agricultural Research Service Patent Culture Collection, Northern Regional
Research Center (NRRL); e.g. Nocardiopsis dassonvillei subsp. dassonvillei DSM 43235 is
publicly available from DSMZ (Deutsche Sammlung von Mikroorganismen und Zellkulturen
GmbH, Braunschweig, Germany). The strain was also deposited at other depositary
institutions as follows: ATCC 23219, IMRU 1250, NCTC 10489.

The following biological materials were deposited in connection with the filing of other
patent applications under the terms of the Budapest Treaty with the Agricultural Research
Culture Collection (NRRL), Peoria, US, or the Deutsche Sammlung von Mikroorganismen und
Zellkulturen GmbH, Mascheroder Weg 1b, D-38124 Braunschweig, and given the following
accession numbers:

Deposit                      Accession Number    Date of Deposit
Nocardiopsis sp.             NRRL 18262          November 10, 1987
Nocardiopsis dassonvillei    NRRL 18133          November 13, 1986
Nocardiopsis alba            DSM 15647           May 30, 2003
Nocardiopsis prasina         DSM 15648           May 30, 2003
Nocardiopsis prasina         DSM 15649           May 30, 2003

These strains have been deposited under conditions that assure that access to the
cultures will be available during the pendency of the other patent applications to one
determined by the Commissioner of Patents and Trademarks to be entitled thereto under 37
C.F.R. §1.14 and 35 U.S.C. §122. The deposits represent substantially pure cultures of the
deposited strains. The deposits are available as required by foreign patent laws in countries
wherein counterparts of these applications, or their progeny are filed. However, it should be
understood that the availability of a deposit does not constitute a license to practice the
invention in derogation of patent rights granted by governmental action.

Of the above strains, the three last-mentioned were isolated in 2001 from soil samples
from Denmark.

Furthermore, such polypeptides may be identified and obtained from other sources
including microorganisms or DNA isolated from nature (e.g., soil, composts, water, etc.) using
suitable probes. Techniques for isolating microorganisms or DNA from natural habitats are
well known in the art. The nucleic acid sequence may then be derived by similarly screening a
genomic or cDNA library of another microorganism. Once a nucleic acid sequence encoding a
polypeptide has been detected with the probe(s), the sequence may be isolated or cloned by
utilizing techniques which are known to those of ordinary skill in the art (see, e.g., Sambrook et
al., 1989, supra).

The parent protease may be a mature part of any of the amino acid sequences
referred to above. A mature part means a mature amino acid sequence and refers to that part
of an amino acid sequence which remains after a potential signal peptide part and/or
propeptide part has been cleaved off.

The parent protease may also be a fragment of a specified amino acid sequence, viz. a
polypeptide having one or more amino acids deleted from the amino and/or carboxyl terminus
of this amino acid sequence. In one embodiment, a fragment contains at least 80, or at least
90, or at least 100, or at least 110, or at least 120, or at least 130, or at least 140, or at least
150, or at least 160, or at least 170, or at least 180, or at least 185 amino acid residues.

The parent protease may also be an allelic variant, allelic referring to the existence of
two or more alternative forms of a gene occupying the same chromosomal locus. Allelic
variation arises naturally through mutation, and may result in polymorphism within populations.
Gene mutations can be silent (no change in the encoded polypeptide) or may encode
polypeptides having altered amino acid sequences. An allelic variant of a polypeptide is a
polypeptide encoded by an allelic variant of a gene.

In another embodiment, the parent protease may be a genetically engineered
protease, e.g. a variant of the wild-type or natural parent proteases referred to above
comprising a substitution, deletion, and/or insertion of one or more amino acids. In other
words: The parent protease may itself be a protease variant. The amino acid sequence of
such parent protease may differ from the amino acid sequence specified by an insertion or
deletion of one or more amino acid residues and/or the substitution of one or more amino acid
residues by different amino acid residues. The amino acid changes may be of a minor, or of a
major, nature. Amino acid changes of a major nature are e.g. those resulting in a variant
protease of the present invention with amended properties. In another particular embodiment,
the amino acid changes are of a minor nature, that is conservative amino acid substitutions
that do not significantly affect the folding and/or activity of the protein; small deletions, typically
of one to about 30 amino acids; small amino- or carboxyl-terminal extensions, such as an
amino-terminal methionine residue; a small linker peptide of up to about 20-25 residues; or a
small extension that facilitates purification by changing net charge or another function, such as
a poly-histidine tract, an antigenic epitope or a binding domain.

Examples of conservative substitutions are within the group of basic amino acids
(arginine, lysine and histidine), acidic amino acids (glutamic acid and aspartic acid), polar
amino acids (glutamine and asparagine), hydrophobic amino acids (leucine, isoleucine and
valine), aromatic amino acids (phenylalanine, tryptophan and tyrosine), and small amino acids
(glycine, alanine, serine, threonine and methionine). Amino acid substitutions which do not
generally alter the specific activity are known in the art and are described, for example, by H.
Neurath and R.L. Hill, 1979, In, The Proteins, Academic Press, New York. The most
commonly occurring exchanges are Ala/Ser, Val/Ile, Asp/Glu, Thr/Ser, Ala/Gly, Ala/Thr,
Ser/Asn, Ala/Val, Ser/Gly, Tyr/Phe, Ala/Pro, Lys/Arg, Asp/Asn, Leu/Ile, Leu/Val, Ala/Glu, and
Asp/Gly as well as these in reverse.
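
The six groups of conservative substitutions listed above can be written down as a small lookup. The following Python sketch is illustrative only; the function name and the one-letter encoding are assumptions made for this example and are not part of the specification.

```python
# Conservative substitution groups as listed in the text (one-letter codes).
CONSERVATIVE_GROUPS = {
    "basic": set("RKH"),        # arginine, lysine, histidine
    "acidic": set("ED"),        # glutamic acid, aspartic acid
    "polar": set("QN"),         # glutamine, asparagine
    "hydrophobic": set("LIV"),  # leucine, isoleucine, valine
    "aromatic": set("FWY"),     # phenylalanine, tryptophan, tyrosine
    "small": set("GASTM"),      # glycine, alanine, serine, threonine, methionine
}


def is_conservative(original: str, replacement: str) -> bool:
    """Return True if the two residues differ and share one of the groups above."""
    if original == replacement:
        return False
    return any(
        original in group and replacement in group
        for group in CONSERVATIVE_GROUPS.values()
    )
```

Note that some of the commonly occurring exchanges quoted from Neurath and Hill (e.g. Ala/Glu) cross these groups, so this check is narrower than that list.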

Still further examples of genetically engineered parent proteases are synthetic
proteases, designed by man, and expectedly not occurring in nature. EP 897985 discloses a
process of preparing a consensus protein. Shuffled proteases are other examples of synthetic
or genetically engineered parent proteases, which can be prepared as is generally known in
the art, e.g. by Site-directed Mutagenesis, by PCR (using a PCR fragment containing the
desired mutation as one of the primers in the PCR reactions), or by Random Mutagenesis.
Included in the concept of a synthetic protease is also any hybrid or chimeric protease, i.e. a
protease which comprises a combination of partial amino acid sequences derived from at least
two proteases.

In further particular embodiments, the parent protease comprises, or consists of,
respectively, the amino acid sequence specified, or an allelic variant thereof; or a fragment
thereof that has protease activity.

In still further particular embodiments, the protease variant of the invention is not
identical to:
(i) amino acids 1-188 of SEQ ID NO: 2, amino acids 1-188 of SEQ ID NO: 4, amino
acids 1-188 of SEQ ID NO: 6, amino acids 1-188 of SEQ ID NO: 8, and amino acids 1-188 of
SEQ ID NO: 10;
(ii) the protease derived from Nocardiopsis dassonvillei NRRL 18133;
(iii) the protease derived from Nocardiopsis sp. FERM P-10508;
(iv) the protease derived from Nocardiopsis dassonvillei strain ZIMET 43647; and/or
(v) any prior art protease of a percentage of identity to SEQ ID NO: 2 of at least 60%.

Microorganism Taxonomy

Questions relating to taxonomy may be solved by consulting a taxonomy data base,
such as the NCBI Taxonomy Browser which is available at the following internet site:
http://www.ncbi.nlm.nih.gov/Taxonomy/taxonomyhome.html/, and/or by consulting Taxonomy
handbooks. For the present purposes, the taxonomy is preferably according to the chapter:
The road map to the Manual by G.M. Garrity & J. G. Holt in Bergey's Manual of Systematic
Bacteriology, 2001, second edition, volume 1, David R. Boone, Richard W. Castenholz.

Amino Acid Homology

The present invention refers to proteases, viz. parent proteases, and/or protease
variants, having a certain degree of identity to SEQ ID NO: 2, such parent and/or variant
proteases being hereinafter designated "homologous proteases".

For purposes of the present invention the degree of identity between two amino acid
sequences, as well as the degree of identity between two nucleotide sequences, is determined
by the program "align" which is a Needleman-Wunsch alignment (i.e. a global alignment). The
program is used for alignment of polypeptide, as well as nucleotide sequences. The default
scoring matrix BLOSUM50 is used for polypeptide alignments, and the default identity matrix is
used for nucleotide alignments. The penalty for the first residue of a gap is -12 for
polypeptides and -16 for nucleotides. The penalties for further residues of a gap are -2 for
polypeptides, and -4 for nucleotides.

"Align" is part of the FASTA package version v20u6 (see W. R. Pearson and D. J.
Lipman (1988), "Improved Tools for Biological Sequence Analysis", PNAS 85:2444-2448, and
W. R. Pearson (1990) "Rapid and Sensitive Sequence Comparison with FASTP and FASTA,"
Methods in Enzymology 183:63-98). FASTA protein alignments use the Smith-Waterman
algorithm with no limitation on gap size (see "Smith-Waterman algorithm", T. F. Smith and M.
S. Waterman (1981) J. Mol. Biol. 147:195-197).
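
To illustrate how the gap penalties above enter a global (Needleman-Wunsch type) alignment, the following Python sketch computes the optimal global score with separate penalties for the first and for further residues of a gap. It is not the "align" program itself. The match and mismatch values (+5 and -4) stand in for the "default identity matrix", whose numerical values are not given in the text, and are therefore assumptions; the function names are likewise hypothetical.

```python
NEG_INF = float("-inf")


def identity_score(x: str, y: str, match: int = 5, mismatch: int = -4) -> int:
    """Assumed identity matrix: one value for identical symbols, one for all others."""
    return match if x == y else mismatch


def global_alignment_score(a, b, score=identity_score, gap_first=-16, gap_next=-4):
    """Optimal global alignment score with first/further gap penalties.

    Defaults are the nucleotide penalties from the text (-16 and -4);
    pass gap_first=-12, gap_next=-2 and a BLOSUM50 lookup for polypeptides.
    """
    n, m = len(a), len(b)
    # Three states: residues paired, gap in b, gap in a.
    paired = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    gap_in_b = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    gap_in_a = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    paired[0][0] = 0
    for i in range(1, n + 1):
        gap_in_b[i][0] = gap_first + (i - 1) * gap_next
    for j in range(1, m + 1):
        gap_in_a[0][j] = gap_first + (j - 1) * gap_next
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            paired[i][j] = score(a[i - 1], b[j - 1]) + max(
                paired[i - 1][j - 1], gap_in_b[i - 1][j - 1], gap_in_a[i - 1][j - 1]
            )
            gap_in_b[i][j] = max(
                paired[i - 1][j] + gap_first,    # open a new gap
                gap_in_a[i - 1][j] + gap_first,  # switch gap side, new gap
                gap_in_b[i - 1][j] + gap_next,   # extend the current gap
            )
            gap_in_a[i][j] = max(
                paired[i][j - 1] + gap_first,
                gap_in_b[i][j - 1] + gap_first,
                gap_in_a[i][j - 1] + gap_next,
            )
    return max(paired[n][m], gap_in_b[n][m], gap_in_a[n][m])
```

With the assumed scores, aligning ACGTACGT against ACGT gives four matches (20) and one gap of four residues (-16 - 4 - 4 - 4 = -28), i.e. a total of -8.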

Multiple alignments of protein sequences may be made using "ClustalW" (Thompson,
J.D., Higgins, D.G. and Gibson, T.J. (1994) CLUSTAL W: improving the sensitivity of
progressive multiple sequence alignment through sequence weighting, position-specific gap
penalties and weight matrix choice. Nucleic Acids Research, 22:4673-4680). Multiple
alignment of DNA sequences may be done using the protein alignment as a template,
replacing the amino acids with the corresponding codon from the DNA sequence.
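
The last step, using a protein alignment as a template for the DNA alignment, can be sketched as follows. This is a minimal Python illustration under the assumption that the DNA string is the in-frame coding sequence of the aligned protein without a stop codon; the function name is hypothetical, and the sketch does not check that each codon actually encodes the aligned residue.

```python
def codon_align(aligned_protein: str, dna: str) -> str:
    """Thread codons onto an aligned protein row; gaps ('-') become '---'."""
    residues = aligned_protein.replace("-", "")
    if len(dna) != 3 * len(residues):
        raise ValueError("DNA length must be three times the number of residues")
    codons = (dna[i:i + 3] for i in range(0, len(dna), 3))
    return "".join("---" if aa == "-" else next(codons) for aa in aligned_protein)
```

Applying the function to each row of a protein multiple alignment yields DNA rows of equal length, three columns per protein column.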
In particular embodiments, the homologous protease has an amino acid sequence
which has a degree of identity to SEQ ID NO: 2 of at least 60%, 62%, 64%, 66%, 68%, 70%,
72%, 74%, 76%, 78%, 80%, 82%, 84%, 86%, 88%, 90%, 92%, 94%, 96%, 98%, or of at least
about 99%.

In alternative embodiments, the homologous protease has an amino acid sequence
which has a degree of identity to SEQ ID NO: 2 of at least 50%, 51%, 52%, 53%, 54%, 55%,
56%, 57%, 58%, or at least 59%.
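
Once two sequences have been aligned, checking one of the identity thresholds above is a matter of counting. The Python sketch below is illustrative only. This section does not spell out the formula for the degree of identity, so the denominator used here (the length of the shorter ungapped sequence) is an assumption, as are the function names.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Exact matches divided by the length of the shorter sequence (assumed convention)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    shortest = min(len(aligned_a.replace("-", "")), len(aligned_b.replace("-", "")))
    if shortest == 0:
        raise ValueError("at least one residue is required in each sequence")
    return 100.0 * matches / shortest


def meets_threshold(aligned_ref: str, aligned_x: str, minimum: float = 60.0) -> bool:
    """True if the degree of identity is at least `minimum` percent."""
    return percent_identity(aligned_ref, aligned_x) >= minimum
```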

In another particular embodiment, the parent protease, and/or the protease variant,
comprises a mature amino acid sequence which differs by no more than seventyfive,
seventyfour, seventythree, seventytwo, seventyone, seventy, sixtynine, sixtyeight, sixtyseven,
sixtysix, sixtyfive, sixtyfour, sixtythree, sixtytwo, sixtyone, sixty, fiftynine, fiftyeight, fiftyseven,
fiftysix, fiftyfive, fiftyfour, fiftythree, fiftytwo, fiftyone, fifty, fortynine, fortyeight, fortyseven,
fortysix, fortyfive, fortyfour, fortythree, fortytwo, fortyone, forty, thirtynine, thirtyeight,
thirtyseven, thirtysix, thirtyfive, thirtyfour, thirtythree, thirtytwo, thirtyone, thirty, twentynine,
twentyeight, twentyseven, twentysix, twentyfive, twentyfour, twentythree, twentytwo,
twentyone, twenty, nineteen, eighteen, seventeen, sixteen, fifteen, fourteen, thirteen, twelve,
eleven, ten, nine, eight, seven, six, five, four, three, by no more than two, or only by one amino
acid(s) from the specified amino acid sequence, e.g. SEQ ID NO: 2.

In a still further particular embodiment, the parent protease, and/or the protease
variant, comprises a mature amino acid sequence which differs by at least seventyfive,
seventyfour, seventythree, seventytwo, seventyone, seventy, sixtynine, sixtyeight, sixtyseven,
sixtysix, sixtyfive, sixtyfour, sixtythree, sixtytwo, sixtyone, sixty, fiftynine, fiftyeight, fiftyseven,
fiftysix, fiftyfive, fiftyfour, fiftythree, fiftytwo, fiftyone, fifty, fortynine, fortyeight, fortyseven,
fortysix, fortyfive, fortyfour, fortythree, fortytwo, fortyone, forty, thirtynine, thirtyeight,
thirtyseven, thirtysix, thirtyfive, thirtyfour, thirtythree, thirtytwo, thirtyone, thirty, twentynine,
twentyeight, twentyseven, twentysix, twentyfive, twentyfour, twentythree, twentytwo,
twentyone, twenty, nineteen, eighteen, seventeen, sixteen, fifteen, fourteen, thirteen, twelve,
eleven, ten, nine, eight, seven, six, five, four, three, by at least two, or by one amino acid(s)
from the specified amino acid sequence, e.g. SEQ ID NO: 2.






Nucleic Acid Hybridization 

In the alternative, homologous parent proteases, as well as variant proteases, may be
defined as being encoded by a nucleic acid sequence which hybridizes under very low
stringency conditions, preferably low stringency conditions, more preferably medium
stringency conditions, more preferably medium-high stringency conditions, even more
preferably high stringency conditions, and most preferably very high stringency conditions with
nucleotides 900-1466 of SEQ ID NO: 1, or a subsequence or a complementary strand thereof
(J. Sambrook, E.F. Fritsch, and T. Maniatis, 1989, Molecular Cloning, A Laboratory Manual,
2d edition, Cold Spring Harbor, New York). A subsequence may be at least 100 nucleotides,
or at least 200, 300, 400, or at least 500 nucleotides. Moreover, the subsequence may encode
a polypeptide fragment that has the relevant enzyme activity.

For long probes of at least 100 nucleotides in length, very low to very high stringency
conditions are defined as prehybridization and hybridization at 42°C in 5X SSPE, 0.3% SDS,
200 µg/ml sheared and denatured salmon sperm DNA, and either 25% formamide for very low
and low stringencies, 35% formamide for medium and medium-high stringencies, or 50%
formamide for high and very high stringencies, following standard Southern blotting
procedures.

For long probes of at least 100 nucleotides in length, the carrier material is finally
washed three times each for 15 minutes using 2 x SSC, 0.2% SDS preferably at least at 45°C
(very low stringency), more preferably at least at 50°C (low stringency), more preferably at
least at 55°C (medium stringency), more preferably at least at 60°C (medium-high stringency),
even more preferably at least at 65°C (high stringency), and most preferably at least at 70°C
(very high stringency).
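
The long-probe conditions of the two preceding paragraphs can be collected in a single lookup table. The Python sketch below only restates the values given in the text (formamide percentage during hybridization at 42°C, and minimum wash temperature in 2 x SSC, 0.2% SDS); the dictionary and function names are hypothetical.

```python
# Long probes (at least 100 nucleotides): values taken from the text above.
LONG_PROBE_CONDITIONS = {
    "very low":    {"formamide_percent": 25, "min_wash_temp_c": 45},
    "low":         {"formamide_percent": 25, "min_wash_temp_c": 50},
    "medium":      {"formamide_percent": 35, "min_wash_temp_c": 55},
    "medium-high": {"formamide_percent": 35, "min_wash_temp_c": 60},
    "high":        {"formamide_percent": 50, "min_wash_temp_c": 65},
    "very high":   {"formamide_percent": 50, "min_wash_temp_c": 70},
}


def long_probe_conditions(stringency: str) -> dict:
    """Return formamide percentage and minimum wash temperature for a stringency level."""
    key = stringency.strip().lower()
    if key not in LONG_PROBE_CONDITIONS:
        raise KeyError(f"unknown stringency level: {stringency!r}")
    return LONG_PROBE_CONDITIONS[key]
```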

For short probes which are about 15 nucleotides to about 70 nucleotides in length,
stringency conditions are defined as prehybridization, hybridization, and washing post-
hybridization at 5°C to 10°C below the calculated Tm using the calculation according to Bolton
and McCarthy (1962, Proceedings of the National Academy of Sciences USA 48:1390) in 0.9
M NaCl, 0.09 M Tris-HCl pH 7.6, 6 mM EDTA, 0.5% NP-40, 1X Denhardt's solution, 1 mM
sodium pyrophosphate, 1 mM sodium monobasic phosphate, 0.1 mM ATP, and 0.2 mg of
yeast RNA per ml following standard Southern blotting procedures.

For short probes which are about 15 nucleotides to about 70 nucleotides in length, the
carrier material is washed once in 6X SSC plus 0.1% SDS for 15 minutes and twice each for
15 minutes using 6X SSC at 5°C to 10°C below the calculated Tm.

Position numbering 

In the present context, the basis for numbering positions is SEQ ID NO: 2, Protease
10, starting with A1 and ending with T188, see Fig. 1. A parent protease, as well as a variant
protease, may comprise extensions as compared to SEQ ID NO: 2, i.e. in the N-terminal,
and/or the C-terminal ends thereof. The amino acids of such extensions, if any, are to be
numbered as is usual in the art, i.e. for a C-terminal extension: 189, 190, 191 and so forth, and
for an N-terminal extension -1, -2, -3 and so forth.
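
The numbering rule for extensions can be sketched in a few lines of Python. The function name and argument names are hypothetical; the sketch assumes, in line with the text, that the residue immediately preceding A1 is numbered -1 (there is no position 0) and that the core has the 188 positions of SEQ ID NO: 2.

```python
def position_labels(n_ext: int, core_len: int = 188, c_ext: int = 0) -> list[int]:
    """Position numbers for a sequence with N- and/or C-terminal extensions."""
    n_term = list(range(-n_ext, 0))                           # ..., -3, -2, -1
    core = list(range(1, core_len + 1))                       # 1 ... 188
    c_term = list(range(core_len + 1, core_len + c_ext + 1))  # 189, 190, ...
    return n_term + core + c_term
```

For a protease with three extra residues at each end, the labels run -3, -2, -1, 1, ..., 188, 189, 190, 191.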


Alterations, such as Substitutions, Deletions, Insertions

In the present context, the following are examples of various ways in which a protease
variant can be designed or derived from a parent amino acid sequence: an amino acid can be
substituted with another amino acid; an amino acid can be deleted; an amino acid can be
inserted; as well as any combination of any number of such alterations.

For the present purposes, the term substitution is intended to include any number of
any type of such alterations. This is a reasonable definition, because, for example, a deletion
can be regarded as a substitution of an amino acid, AA, in a given position, nn, with nothing,
(). Such substitution can be designated: AAnn(). Likewise, an insertion of only one amino acid,
BB, downstream an amino acid, AA, in a given position, nn, can be designated: ()nnaBB. And
if two amino acids, BB and CC, are inserted downstream of amino acid AA in position nn, this
substitution (combination of two substitutions) can be designated: ()nnaBB+()nnbCC, the thus
created gaps between amino acids nn and nn+1 in the parent sequence being assigned lower
case or subscript letters a, b, c etc. to the former position number, here nn. A similar
numbering procedure is followed when aligning a new sequence to the multiple alignment of
Fig. 1, in case of a gap being created by the alignment between amino acids nn and nn+1:
Each position of the gap is assigned a number nna, nnb etc. A comma (,) between
substituents, as e.g. in the substitution T129E,D,Y,Q means "either or", i.e. that T129 is
substituted with E, or D, or Y, or Q. A plus-sign (+) between substitutions, e.g. 129D+135P
means "and", i.e. that these two single substitutions are combined in one and the same
protease variant.
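
The designations described above (AAnn(), ()nnaBB, comma for "either or", plus-sign for "and") are regular enough to be parsed mechanically. The following Python sketch is one possible reading of the notation, assuming one-letter amino acid codes as in the examples T129E,D,Y,Q and 129D+135P; the class and function names are hypothetical, and negative position numbers (N-terminal extensions) are not handled.

```python
import re
from typing import NamedTuple, Optional


class Alteration(NamedTuple):
    original: Optional[str]  # parent residue; None if omitted or "()" (insertion)
    position: int            # position number nn
    insertion: str           # "" or the lower-case letter a, b, c ... of an inserted position
    replacements: tuple      # alternative residues; empty tuple means deletion "()"


_ALTERATION = re.compile(r"^(\(\)|[A-Z])?(\d+)([a-z]?)(\(\)|[A-Z](?:,[A-Z])*)$")


def parse_variant(text: str) -> list:
    """Split on '+' (combined alterations) and parse each single designation."""
    alterations = []
    for part in text.replace(" ", "").split("+"):
        match = _ALTERATION.match(part)
        if match is None:
            raise ValueError(f"cannot parse designation: {part!r}")
        orig, pos, ins, repl = match.groups()
        original = None if orig in (None, "()") else orig
        replacements = () if repl == "()" else tuple(repl.split(","))
        alterations.append(Alteration(original, int(pos), ins, replacements))
    return alterations
```

For example, `parse_variant("()129aG+()129bS")` describes two residues inserted downstream of position 129, and `parse_variant("T129()")` describes a deletion of T129.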

In the present context, the term "a substitution" means at least one substitution. At
least one means one or more, e.g. one, or two, or three, or four, or five, or six, or seven, or
eight, or nine, or ten, or twelve, or fourteen, or fifteen, or sixteen, or eighteen, or twenty, or
twentytwo, or twentyfour, or twentyfive, or twentyeight, or thirty, and so on, to include in
principle, any number of substitutions. The variants of the invention, however, still have to be,
e.g., at least 60% identical to SEQ ID NO: 2, this percentage being determined by the above-
mentioned program. The substitutions can be applied to any position encompassed by any
region mentioned in claim 1, and variants comprising combinations of any number and type of
such substitutions are also included. The term substitution as used herein also includes
deletions, as well as extensions, or insertions, that may add to the length of the sequence
corresponding to SEQ ID NO: 2.

Furthermore, the term "a substitution" embraces a substitution into any one of the other
nineteen natural amino acids, or into other amino acids, such as non-natural amino acids. For
example, a substitution of amino acid T in position 22 includes each of the following
substitutions: 22A, 22C, 22D, 22E, 22F, 22G, 22H, 22I, 22K, 22L, 22M, 22N, 22P, 22Q, 22R,
22S, 22V, 22W, and 22Y. This is, by the way, equivalent to the designation 22X, wherein X
designates any amino acid. These substitutions can also be designated T22A, T22C, T22X,
etc. The same applies by analogy to each and every position mentioned herein, to specifically
include herein any one of such substitutions.
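
The expansion of a designation such as 22X into the nineteen individual substitutions can be sketched as follows. The function name is hypothetical, and only the twenty natural amino acids are enumerated (non-natural amino acids, which the text also allows, are left out).

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the twenty natural amino acids, one-letter codes


def expand_any(parent: str, position: int) -> list:
    """List every substitution of `parent` at `position` into another natural amino acid."""
    return [f"{position}{aa}" for aa in AMINO_ACIDS if aa != parent]
```

`expand_any("T", 22)` reproduces the list given above, from 22A to 22Y.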


Identifying Corresponding Position Numbers 

For each amino acid residue in each parent or variant protease of the invention and/or
for use according to the invention, it is possible to directly and unambiguously assign an amino
acid residue in SEQ ID NO: 2 to which it corresponds. Corresponding residues are assigned
the same number, by reference to the Protease 10 sequence.

As it appears from the numbering of Fig. 1, in conjunction with the numbering of the
sequence listing, for each amino acid residue of each of the proteases Protease 10, Protease
18, Protease 11, Protease 35, and Protease 08, the corresponding amino acid residue in SEQ
ID NO: 2 has the same number. This number is easily derivable from Fig. 1. At least in case of
these five proteases, the number is the same as the number assigned to this amino acid
residue in the sequence listing for the mature part of the respective protease.

For a given position in another protease - be it a parent or a variant protease - a
corresponding position of SEQ ID NO: 2 can always be found, as follows:

The amino acid sequence of another parent protease, or, in turn, of a variant protease
amino acid sequence, is designated SEQ-X. A position corresponding to position N of SEQ ID
NO: 2 is found as follows: The parent or variant protease amino acid sequence SEQ-X is
aligned with SEQ ID NO: 2 as specified above in the section entitled Amino Acid Homology.
From the alignment, the position in sequence SEQ-X corresponding to position N of SEQ ID
NO: 2 can be clearly and unambiguously derived, using the principles described below.

SEQ-X may be a mature part of the protease in question, or it may also include a
signal peptide part, or it may be a fragment of the mature protease which has protease
activity, e.g. a fragment of the same length as SEQ ID NO: 2, and/or it may be the fragment
which extends from A1 to T188 when aligned with SEQ ID NO: 2 as described herein.
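
The derivation of corresponding position numbers from a pairwise alignment can be sketched as below. The Python code is illustrative only and combines the numbering rules stated in this description: residues aligned to a reference residue take its number, residues falling in an internal gap of the reference are labelled nna, nnb, etc., and residues before position 1 or after the last reference position are numbered as N- or C-terminal extensions. The function name and the use of '-' as gap symbol are assumptions.

```python
import string


def map_positions(aligned_ref: str, aligned_x: str) -> list:
    """Label each residue of SEQ-X with its position number relative to the reference."""
    if len(aligned_ref) != len(aligned_x):
        raise ValueError("aligned sequences must have the same length")
    ref_len = sum(c != "-" for c in aligned_ref)
    lead = len(aligned_ref) - len(aligned_ref.lstrip("-"))  # columns before position 1
    trail_start = len(aligned_ref.rstrip("-"))              # first column after last position
    labels = []
    ref_pos = 0
    gap_index = 0
    for col, (r, x) in enumerate(zip(aligned_ref, aligned_x)):
        if r != "-":
            ref_pos += 1
            gap_index = 0
            if x != "-":
                labels.append((x, str(ref_pos)))
        elif x != "-":
            if col < lead:                       # N-terminal extension: ..., -2, -1
                labels.append((x, str(col - lead)))
            elif col >= trail_start:             # C-terminal extension: last+1, last+2, ...
                labels.append((x, str(ref_len + col - trail_start + 1)))
            else:                                # internal gap: nna, nnb, ...
                labels.append((x, f"{ref_pos}{string.ascii_lowercase[gap_index]}"))
                gap_index += 1
    return labels
```

With a toy reference row `--ACD-EF-` and a toy SEQ-X row `MKAC-GEFL`, the residues of SEQ-X are labelled -2, -1, 1, 2, 3a, 4, 5 and 6.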



Region and Position

In the present context, the term region means at least one position of a parent
protease amino acid sequence, the term position designating an amino acid residue of such
amino acid sequence. In one embodiment, region means one or more successive positions of
the parent protease amino acid sequence, e.g. one, two, three, four, five, six, seven, eight,
etc., up to any number of consecutive positions of the sequence. Accordingly, a region may
consist of one position only, or it may consist of any number of consecutive positions, such as,
e.g., position no. 62 and 63; or position no. 111, 112, 113 and 114. For the present purposes,
these two regions are designated 62-63, and 111-114, respectively. The boundaries of these
regions or ranges are included in the region.

A region encompasses specifically each and every position it embraces. For example,
region 111-114 specifically encompasses each of the positions 111, 112, 113, and 114. The
same applies by analogy for the other regions mentioned herein.
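
Since region boundaries are inclusive, testing whether a position falls within one of the regions is a simple range check. The Python sketch below uses the regions listed in the Summary of the Invention; the function name is hypothetical.

```python
# Regions as listed in the Summary of the Invention (boundaries inclusive).
REGIONS = [
    (6, 18), (22, 28), (32, 39), (42, 58), (62, 63), (66, 76), (78, 100),
    (103, 106), (111, 114), (118, 131), (134, 136), (139, 141), (144, 151),
    (155, 156), (160, 176), (179, 181), (184, 188),
]


def region_of(position: int):
    """Return the (start, end) region containing `position`, or None if there is none."""
    for start, end in REGIONS:
        if start <= position <= end:
            return (start, end)
    return None
```

For example, position 112 falls in region 111-114, whereas position 19 is not covered by any of the regions.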

Thermostability 

For the present purposes, the term thermostable as applied in the context of a certain
polypeptide, refers to the melting temperature, Tm, of such polypeptide, as determined using
Differential Scanning Calorimetry (DSC) in 10 mM sodium phosphate, 50 mM sodium chloride,
pH 7.0, using a constant scan rate of 1.5°C/min.

The following Tm's were determined under the above conditions: 76.5°C (Protease 10),
83.0°C (Protease 18), 78.3°C (Protease 08), 76.6°C (Protease 35), and 73.7°C (Protease 11).

For a thermostable polypeptide, the Tm is at least 83.1°C. In particular embodiments,
the Tm is at least 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99 or at least
100°C.

In the alternative, the term thermostable refers to a melting temperature of at least
73.8, or at least 76.7°C, or at least 78.4°C, preferably at least 74, 75, 76, 77, 78, 79, 80, 81,
82, or at least 83°C, still as determined using DSC at a pH of 7.0.
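
The Tm values and thresholds above lend themselves to a small check. The Python sketch below merely restates the numbers from the text; the dictionary and function names are hypothetical.

```python
# Melting temperatures (°C) of the parent proteases as reported above (DSC, pH 7.0).
PARENT_TM_C = {
    "Protease 10": 76.5,
    "Protease 18": 83.0,
    "Protease 08": 78.3,
    "Protease 35": 76.6,
    "Protease 11": 73.7,
}


def is_thermostable(tm_c: float, threshold_c: float = 83.1) -> bool:
    """True if the melting temperature reaches the threshold (83.1°C by default)."""
    return tm_c >= threshold_c


def more_thermostable_than(tm_variant_c: float, parent: str) -> bool:
    """True if a variant's Tm exceeds the Tm reported for the named parent protease."""
    return tm_variant_c > PARENT_TM_C[parent]
```

Under the primary definition none of the five parent proteases is thermostable, since the highest reported Tm (Protease 18, 83.0°C) lies just below the 83.1°C threshold.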

For the determination of Tm, a sample of the polypeptide with a purity of at least 90%
(or 91, 92, 93, 94, 95, 96, 97, or 98%) as determined by SDS-PAGE may be used. Still further,
the enzyme sample may have a concentration of between 0.5 and 2.5 mg/ml protein (or
between 0.6 and 2.4, or between 0.7 and 2.2, or between 0.8 and 2.0 mg/ml protein), as
determined from absorbance at 280 nm and based on an extinction coefficient calculated from
the amino acid sequence of the enzyme in question.
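
The text does not state which method is used to calculate the extinction coefficient from the amino acid sequence. As an illustration only, the Python sketch below uses one widely quoted estimate (5500 per tryptophan, 1490 per tyrosine and 125 per cystine, in M^-1 cm^-1, after Pace et al.), which is an assumption here, together with the Beer-Lambert law. The function names, the path length default and the molecular weight argument are likewise assumptions.

```python
def molar_extinction_280(sequence: str, cystines: int = 0) -> int:
    """Estimate the molar extinction coefficient at 280 nm (M^-1 cm^-1) from a sequence."""
    seq = sequence.upper()
    return 5500 * seq.count("W") + 1490 * seq.count("Y") + 125 * cystines


def concentration_mg_per_ml(a280: float, epsilon: float, molecular_weight: float,
                            path_cm: float = 1.0) -> float:
    """Beer-Lambert: molar concentration = A / (epsilon * l); times MW (g/mol) gives mg/ml."""
    if epsilon <= 0:
        raise ValueError("extinction coefficient must be positive")
    return a280 / (epsilon * path_cm) * molecular_weight
```

The result can then be compared with the 0.5 to 2.5 mg/ml window mentioned above.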
The DSC takes place at the desired pH (e.g. pH 5.5, 7.0, 3.0, or 2.5) and with a
constant heating rate, e.g. of 1, 1.5, 2, 3, 4, 5, 6, 7, 8, 9 or 10°C/min.

In a particular embodiment, the protease variant of the invention is more thermostable 
than the parent protease. A preferred parent protease for this purpose is Protease 18. 



Low-allergenic

In a specific embodiment, the protease variants of the present invention are (also) low-
allergenic variants, designed to invoke a reduced immunological response when exposed to
animals, including man. The term immunological response is to be understood as any reaction
by the immune system of an animal exposed to the protease variant. One type of
immunological response is an allergic response leading to increased levels of IgE in the
exposed animal. Low-allergenic variants may be prepared using techniques known in the art.
For example the protease variant may be conjugated with polymer moieties shielding portions
or epitopes of the protease variant involved in an immunological response. Conjugation with
polymers may involve in vitro chemical coupling of polymer to the protease variant, e.g. as
described in WO 96/17929, WO 98/30682, WO 98/35026, and/or WO 99/00489. Conjugation
may in addition or alternatively thereto involve in vivo coupling of polymers to the protease
variant. Such conjugation may be achieved by genetic engineering of the nucleotide sequence
encoding the protease variant, inserting consensus sequences encoding additional
glycosylation sites in the protease variant and expressing the protease variant in a host
capable of glycosylating the protease variant, see e.g. WO 00/26354. Another way of
providing low-allergenic variants is genetic engineering of the nucleotide sequence encoding
the protease variant so as to cause the protease variants to self-oligomerize, effecting that
protease variant monomers may shield the epitopes of other protease variant monomers and
thereby lowering the antigenicity of the oligomers. Such products and their preparation are
described e.g. in WO 96/16177. Epitopes involved in an immunological response may be
identified by various methods such as the phage display method described in WO 00/26230
and WO 01/83559, or the random approach described in EP 561907. Once an epitope has
been identified, its amino acid sequence may be altered to produce altered immunological
properties of the protease variant by known gene manipulation techniques such as site
directed mutagenesis (see e.g. WO 00/26230, WO 00/26354 and/or WO 00/22103) and/or
conjugation of a polymer may be done in sufficient proximity to the epitope for the polymer to
shield the epitope.

Nucleic Acid Sequences and Constructs 

The present invention also relates to nucleic acid sequences comprising a nucleic acid 
sequence which encodes a protease variant of the invention. 

The term "isolated nucleic acid sequence" refers to a nucleic acid sequence which is
essentially free of other nucleic acid sequences, e.g., at least about 20% pure, preferably at
least about 40% pure, more preferably at least about 60% pure, even more preferably at least
about 80% pure, and most preferably at least about 90% pure as determined by agarose
electrophoresis. For example, an isolated nucleic acid sequence can be obtained by standard
cloning procedures used in genetic engineering to relocate the nucleic acid sequence from its
natural location to a different site where it will be reproduced. The cloning procedures may
involve excision and isolation of a desired nucleic acid fragment comprising the nucleic acid
sequence encoding the polypeptide, insertion of the fragment into a vector molecule, and
incorporation of the recombinant vector into a host cell where multiple copies or clones of the
nucleic acid sequence will be replicated. The nucleic acid sequence may be of genomic,
cDNA, RNA, semisynthetic, synthetic origin, or any combinations thereof.

The nucleic acid sequences of the invention can be prepared by introducing at least
one mutation into the parent protease coding sequence or a subsequence thereof, wherein the
mutant nucleic acid sequence encodes a variant protease. The introduction of a mutation into
the nucleic acid sequence to exchange one nucleotide for another nucleotide may be
accomplished using any of the methods known in the art, e.g. by site-directed mutagenesis,
by random mutagenesis, or by doped, spiked, or localized random mutagenesis.

Random mutagenesis is suitably performed either as localized or region-specific
random mutagenesis in at least three parts of the gene translating to the amino acid sequence
in question, or within the whole gene. When the mutagenesis is performed by the use
of an oligonucleotide, the oligonucleotide may be doped or spiked with the three non-parent
nucleotides during the synthesis of the oligonucleotide at the positions which are to be
changed. The doping or spiking may be performed so that codons for unwanted amino acids
are avoided. The doped or spiked oligonucleotide can be incorporated into the DNA encoding
the protease enzyme by any technique, using, e.g., PCR, LCR or any DNA polymerase and
ligase as deemed appropriate.






Preferably, the doping is carried out using "constant random doping", in which the
percentage of wild-type and mutation in each position is predefined. Furthermore, the doping
may be directed toward a preference for the introduction of certain nucleotides, and thereby a
preference for the introduction of one or more specific amino acid residues. The doping may
be made, e.g., so as to allow for the introduction of 90% wild type and 10% mutations in each
position. The choice of a doping scheme may additionally be based on genetic as
well as protein-structural constraints.
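As an illustration only, constant random doping with a predefined wild-type percentage per position can be sketched in a few lines of Python. The function names `dope_oligo` and `expected_mutations` are hypothetical, and the uniform choice among the three non-parent nucleotides is an assumption; as noted above, a real doping scheme may instead be biased toward particular nucleotides.

```python
import random

NUCLEOTIDES = "ACGT"


def dope_oligo(parent, wild_type_fraction=0.9, rng=None):
    """Return one simulated doped oligonucleotide.

    Each position keeps the parent nucleotide with probability
    wild_type_fraction; otherwise one of the three non-parent
    nucleotides is drawn uniformly (an assumption of this sketch).
    """
    rng = rng or random.Random()
    doped = []
    for base in parent:
        if rng.random() < wild_type_fraction:
            doped.append(base)
        else:
            alternatives = [n for n in NUCLEOTIDES if n != base]
            doped.append(rng.choice(alternatives))
    return "".join(doped)


def expected_mutations(length, wild_type_fraction=0.9):
    """Expected number of mutated positions in an oligo of the given length."""
    return length * (1.0 - wild_type_fraction)
```

With 90% wild type per position, a 30-nucleotide doped stretch would be expected to carry about three nucleotide changes on average.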

The random mutagenesis may be advantageously localized to a part of the parent
protease in question. This may, e.g., be advantageous when certain regions of the enzyme
have been identified to be of particular importance for a given property of the enzyme.

Alternative methods for providing variants of the invention include gene shuffling e.g. 
as described in WO 95/22625 or in WO 96/00343, and the consensus derivation process as 
described in EP 697985. 

The present invention does not relate to the following nucleic acid sequences:

(i) nucleotides 900-1466 of SEQ ID NO: 1, nucleotides 499-1062 of SEQ ID NO: 3,
nucleotides 496-1059 of SEQ ID NO: 5, nucleotides 496-1059 of SEQ ID NO: 7, and
nucleotides 502-1065 of SEQ ID NO: 9;

(ii) the nucleic acid sequence encoding the mature peptide part of the protease derived
from Nocardiopsis dassonvillei NRRL 18133, in the event that this protease has at least 60%
identity to SEQ ID NO: 2;

(iii) the nucleic acid sequence encoding the mature peptide part of the protease derived
from Nocardiopsis sp. FERM P-10508, in the event that this protease has at least 60% identity
to SEQ ID NO: 2;

(iv) the nucleic acid sequence encoding the mature peptide part of the protease
derived from Nocardiopsis dassonvillei strain ZIMET 43647; and/or

(v) nucleic acid sequences encoding any prior art proteases of at least 60% identity to
SEQ ID NO: 2.

Nucleic Acid Constructs 

A nucleic acid construct comprises a nucleic acid sequence of the present invention
operably linked to one or more control sequences which direct the expression of the coding
sequence in a suitable host cell under conditions compatible with the control sequences.
Expression will be understood to include any step involved in the production of the polypeptide
including, but not limited to, transcription, post-transcriptional modification, translation,
post-translational modification, and secretion.

Expression vector

A nucleic acid sequence encoding a protease variant of the invention can be
expressed using an expression vector which typically includes control sequences encoding a
promoter, operator, ribosome binding site, translation initiation signal, and, optionally, a
repressor gene or various activator genes.

The recombinant expression vector carrying the DNA sequence encoding a protease
variant of the invention may be any vector which may conveniently be subjected to
recombinant DNA procedures, and the choice of vector will often depend on the host cell into
which it is to be introduced. The vector may be one which, when introduced into a host cell, is
integrated into the host cell genome and replicated together with the chromosome(s) into
which it has been integrated.

The protease variant may also be co-expressed together with at least one other
enzyme of animal feed interest, such as a phytase, a galactanase, a xylanase, an
endoglucanase, an endo-1,3(4)-beta-glucanase, an alpha-galactosidase, and/or a protease.
The enzymes may be co-expressed from different vectors, from one vector, or using a mixture
of both techniques. When using different vectors, the vectors may have different selectable
markers, and different origins of replication. When using only one vector, the genes can be
expressed from one or more promoters. If cloned under the regulation of one promoter (di- or
multi-cistronic), the order in which the genes are cloned may affect the expression levels of the
proteins. The protease variant may also be expressed as a fusion protein, i.e. the gene
encoding the protease variant is fused in frame to the gene encoding another protein.
This protein may be another enzyme or a functional domain from another enzyme.

Host Cells

The present invention also relates to recombinant host cells, comprising a nucleic acid
sequence of the invention, which are advantageously used in the recombinant production of
the polypeptides. A vector comprising a nucleic acid sequence of the present invention is
introduced into a host cell so that the vector is maintained as a chromosomal integrant or as a
self-replicating extra-chromosomal vector. The term "host cell" encompasses any progeny of a
parent cell that is not identical to the parent cell due to mutations that occur during replication.
The choice of a host cell will to a large extent depend upon the gene encoding the polypeptide
and its source.

The host cell may be a unicellular microorganism, e.g., a prokaryote, or a non- 
unicellular microorganism, e.g., a eukaryote cell, such as an animal, a mammalian, an insect, 
a plant, or a fungal cell. Preferred animal cells are non-human animal cells. 

In a preferred embodiment, the host cell is a fungal cell, or a yeast cell, such as a
Candida, Hansenula, Kluyveromyces, Pichia, Saccharomyces, Schizosaccharomyces, or
Yarrowia cell. The fungal host cell may be a filamentous fungal cell, such as a cell of a species
of, but not limited to, Acremonium, Aspergillus, Fusarium, Humicola, Mucor, Myceliophthora,
Neurospora, Penicillium, Thielavia, Tolypocladium, or Trichoderma. Useful unicellular cells are
bacterial cells such as gram positive bacteria including, but not limited to, a Bacillus cell, e.g.,
Bacillus alkalophilus, Bacillus amyloliquefaciens, Bacillus brevis, Bacillus circulans, Bacillus
clausii, Bacillus coagulans, Bacillus lautus, Bacillus lentus, Bacillus licheniformis, Bacillus
megaterium, Bacillus stearothermophilus, Bacillus subtilis, and Bacillus thuringiensis, or a
Streptomyces cell, such as Streptomyces lividans or Streptomyces murinus, or a Nocardiopsis
cell, or cells of lactic acid bacteria; or gram negative bacteria such as E. coli and
Pseudomonas sp. Lactic acid bacteria include, but are not limited to, species of the genera
Lactococcus, Lactobacillus, Leuconostoc, Streptococcus, Pediococcus, and Enterococcus.

Methods of Production

The present invention also relates to methods for producing a protease variant of the
present invention comprising (a) cultivating a host cell under conditions conducive for
production of the protease variant; and (b) recovering the protease variant.

In the production methods of the present invention, the cells are cultivated in a nutrient
medium suitable for production of the polypeptide using methods known in the art. For
example, the cell may be cultivated by shake flask cultivation, small-scale or large-scale
fermentation (including continuous, batch, fed-batch, or solid state fermentations) in laboratory
or industrial fermentors performed in a suitable medium and under conditions allowing the
polypeptide to be expressed and/or isolated. The cultivation takes place in a suitable nutrient
medium comprising carbon and nitrogen sources and inorganic salts, using procedures known
in the art. Suitable media are available from commercial suppliers or may be prepared
according to published compositions (e.g., in catalogues of the American Type Culture
Collection). If the protease is secreted into the nutrient medium, it can be recovered directly
from the medium. If it is not secreted, it can be recovered from cell lysates.

The resulting protease may be recovered by methods known in the art. For example, it
can be recovered from the nutrient medium by conventional procedures including, but not
limited to, centrifugation, filtration, extraction, spray-drying, evaporation, or precipitation.

The proteases of the present invention may be purified by a variety of procedures
known in the art including, but not limited to, chromatography (e.g., ion exchange, affinity,
hydrophobic, chromatofocusing, and size exclusion), electrophoretic procedures (e.g.,
preparative isoelectric focusing), differential solubility (e.g., ammonium sulfate precipitation),
SDS-PAGE, or extraction (see, e.g., Protein Purification, J.-C. Janson and Lars Ryden,
editors, VCH Publishers, New York, 1989).

The present invention also relates to a transgenic plant, plant part, or plant cell which
has been transformed with a nucleic acid sequence encoding a polypeptide having protease
activity of the present invention so as to express and produce the polypeptide in recoverable
quantities. The polypeptide may be recovered from the plant or plant part. Alternatively, the
plant or plant part containing the recombinant polypeptide may be used as such for improving
the quality of a food or feed, e.g., improving nutritional value, palatability, and rheological
properties, or to destroy an antinutritive factor.

In a particular embodiment, the polypeptide is targeted to the endosperm storage
vacuoles in seeds. This can be obtained by synthesizing it as a precursor with a suitable signal
peptide, see Horvath et al. in PNAS, Feb. 15, 2000, vol. 97, no. 4, p. 1914-1919.

The transgenic plant can be dicotyledonous (a dicot) or monocotyledonous (a
monocot) or engineered variants thereof. Examples of monocot plants are grasses, such as
meadow grass (blue grass, Poa), forage grass such as festuca, lolium, temperate grass, such
as Agrostis, and cereals, e.g., wheat, oats, rye, barley, rice, sorghum, and maize (corn).

Examples of dicot plants are tobacco, legumes, such as lupins, potato, sugar beet,
pea, bean and soybean, and cruciferous plants (family Brassicaceae), such as cauliflower,
rape seed, and the closely related model organism Arabidopsis thaliana. Low-phytate plants
as described e.g. in US patent no. 5,689,054 and US patent no. 6,111,168 are examples of
engineered plants. Examples of plant parts are stem, callus, leaves, root, fruits, seeds, and
tubers. Also specific plant tissues, such as chloroplast, apoplast, mitochondria, vacuole,
peroxisomes, and cytoplasm are considered to be a plant part. Furthermore, any plant cell,
whatever the tissue origin, is considered to be a plant part.

Also included within the scope of the present invention are the progeny of such
plants, plant parts and plant cells.

The transgenic plant or plant cell expressing a polypeptide of the present invention
may be constructed in accordance with methods known in the art. Briefly, the plant or plant cell
is constructed by incorporating one or more expression constructs encoding a polypeptide of
the present invention into the plant host genome and propagating the resulting modified plant
or plant cell into a transgenic plant or plant cell.

Conveniently, the expression construct is a nucleic acid construct which comprises a
nucleic acid sequence encoding a polypeptide of the present invention operably linked with
appropriate regulatory sequences required for expression of the nucleic acid sequence in the
plant or plant part of choice. Furthermore, the expression construct may comprise a selectable
marker useful for identifying host cells into which the expression construct has been integrated
and DNA sequences necessary for introduction of the construct into the plant in question (the
latter depends on the DNA introduction method to be used).

The choice of regulatory sequences, such as promoter and terminator sequences and
optionally signal or transit sequences, is determined, for example, on the basis of when,
where, and how the polypeptide is desired to be expressed. For instance, the expression of
the gene encoding a polypeptide of the present invention may be constitutive or inducible, or
may be developmental, stage or tissue specific, and the gene product may be targeted to a
specific tissue or plant part such as seeds or leaves. Regulatory sequences are, for example,
described by Tague et al., 1988, Plant Physiology 86: 506.

For constitutive expression, the 35S-CaMV promoter may be used (Franck et al.,
1980, Cell 21: 285-294). Organ-specific promoters may be, for example, a promoter from
storage sink tissues such as seeds, potato tubers, and fruits (Edwards & Coruzzi, 1990, Ann.
Rev. Genet. 24: 275-303), or from metabolic sink tissues such as meristems (Ito et al., 1994,
Plant Mol. Biol. 24: 863-878), a seed specific promoter such as the glutelin, prolamin, globulin,
or albumin promoter from rice (Wu et al., 1998, Plant and Cell Physiology 39: 885-889), a
Vicia faba promoter from the legumin B4 and the unknown seed protein gene from Vicia faba
(Conrad et al., 1998, Journal of Plant Physiology 152: 708-711), a promoter from a seed oil
body protein (Chen et al., 1998, Plant and Cell Physiology 39: 935-941), the storage protein
napA promoter from Brassica napus, or any other seed specific promoter known in the art,
e.g., as described in WO 91/14772. Furthermore, the promoter may be a leaf specific
promoter such as the rbcs promoter from rice or tomato (Kyozuka et al., 1993, Plant
Physiology 102: 991-1000), the chlorella virus adenine methyltransferase gene promoter (Mitra
and Higgins, 1994, Plant Molecular Biology 26: 85-93), or the aldP gene promoter from rice
(Kagaya et al., 1995, Molecular and General Genetics 248: 668-674), or a wound inducible
promoter such as the potato pin2 promoter (Xu et al., 1993, Plant Molecular Biology 22: 573-
588).

A promoter enhancer element may also be used to achieve higher expression of the
enzyme in the plant. For instance, the promoter enhancer element may be an intron which is
placed between the promoter and the nucleotide sequence encoding a polypeptide of the
present invention. For instance, Xu et al., 1993, supra disclose the use of the first intron of the
rice actin 1 gene to enhance expression.

Still further, the codon usage may be optimized for the plant species in question to
improve expression (see Horvath et al. referred to above).

The selectable marker gene and any other parts of the expression construct may be
chosen from those available in the art.

The nucleic acid construct is incorporated into the plant genome according to
conventional techniques known in the art, including Agrobacterium-mediated transformation,
virus-mediated transformation, microinjection, particle bombardment, biolistic transformation,
and electroporation (Gasser et al., 1990, Science 244: 1293; Potrykus, 1990, Bio/Technology
8: 535; Shimamoto et al., 1989, Nature 338: 274).

Presently, Agrobacterium tumefaciens-mediated gene transfer is the method of
choice for generating transgenic dicots (for a review, see Hooykaas and Schilperoort, 1992,
Plant Molecular Biology 19: 15-38). However, it can also be used for transforming monocots,
although other transformation methods are generally preferred for these plants. Presently, the
method of choice for generating transgenic monocots is particle bombardment (microscopic
gold or tungsten particles coated with the transforming DNA) of embryonic calli or developing
embryos (Christou, 1992, Plant Journal 2: 275-281; Shimamoto, 1994, Current Opinion
Biotechnology 5: 158-162; Vasil et al., 1992, Bio/Technology 10: 667-674). An alternative
method for transformation of monocots is based on protoplast transformation as described by
Omirulleh et al., 1993, Plant Molecular Biology 21: 415-428.

Following transformation, the transformants having incorporated therein the 
expression construct are selected and regenerated into whole plants according to methods 
well-known in the art. 

The present invention also relates to methods for producing a polypeptide of the
present invention comprising (a) cultivating a transgenic plant or a plant cell comprising a
nucleic acid sequence encoding a protease variant of the present invention under conditions
conducive for production of the protease variant; and (b) recovering the protease variant.

Animals as Expression Hosts

The present invention also relates to a transgenic, non-human animal and products or
elements thereof, examples of which are body fluids such as milk and blood, organs, flesh,
and animal cells. Techniques for expressing proteins, e.g. in mammalian cells, are known in
the art, see e.g. the handbook Protein Expression: A Practical Approach, Higgins and Hames
(eds), Oxford University Press (1999), and the three other handbooks in this series relating to
Gene Transcription, RNA processing, and Post-translational Processing. Generally speaking,
to prepare a transgenic animal, selected cells of a selected animal are transformed with a
nucleic acid sequence encoding a protease variant of the present invention so as to express
and produce the protease variant. The protease variant may be recovered from the animal,
e.g. from the milk of female animals, or it may be expressed to the benefit of the animal itself,
e.g. to assist the animal's digestion. Examples of animals are mentioned below in the section
headed Animal Feed and Animal Feed Additives.

To produce a transgenic animal with a view to recovering the protease variant from the
milk of the animal, a gene encoding the protease variant may be inserted into the fertilized
eggs of an animal in question, e.g. by use of a transgene expression vector which comprises a
suitable milk protein promoter, and the gene encoding the protease variant. The transgene
expression vector is microinjected into fertilized eggs, and preferably permanently integrated
into the chromosome. Once the egg begins to grow and divide, the potential embryo is
implanted into a surrogate mother, and animals carrying the transgene are identified. The
resulting animal can then be multiplied by conventional breeding. The protease variant may be
purified from the animal's milk, see e.g. Meade, H.M. et al. (1999): Expression of recombinant
proteins in the milk of transgenic animals, Gene expression systems: Using nature for the art
of expression. J. M. Fernandez and J. P. Hoeffler (eds.), Academic Press.

In the alternative, in order to produce a transgenic non-human animal that carries in
the genome of its somatic and/or germ cells a nucleic acid sequence including a heterologous
transgene construct including a transgene encoding the protease variant, the transgene may
be operably linked to a first regulatory sequence for salivary gland specific expression of the
protease variant, as disclosed in WO 2000064247.

Animal Feed and Animal Feed Additives 

For the present purposes, the term animal includes all animals, including human
beings. In a particular embodiment, the protease variants and compositions of the invention
can be used as a feed additive for non-human animals. Examples of animals are non-
ruminants, and ruminants, such as cows, sheep and horses. In a particular embodiment, the
animal is a non-ruminant animal. Non-ruminant animals include mono-gastric animals, e.g.
pigs or swine (including, but not limited to, piglets, growing pigs, and sows); poultry such as
turkeys, ducks and chicken (including but not limited to broiler chicks, layers); young calves;
and fish (including but not limited to salmon, trout, tilapia, catfish and carps); and crustaceans
(including but not limited to shrimps and prawns).

The term feed or feed composition means any compound, preparation, mixture, or
composition suitable for, or intended for intake by an animal. The feed can be fed to the
animal before, after, or simultaneously with the diet. The latter is preferred.

The composition of the invention, when intended for addition to animal feed, may be
designated an animal feed additive. Such additive always comprises the protease variant in
question, preferably in the form of stabilized liquid or dry compositions. The additive may
comprise other components or ingredients of animal feed. The so-called pre-mixes for animal
feed are particular examples of such animal feed additives. Pre-mixes may contain the
enzyme(s) in question, and in addition at least one vitamin and/or at least one mineral.

Accordingly, in a particular embodiment, in addition to the component polypeptides, the
composition of the invention may comprise or contain at least one fat-soluble vitamin, and/or
at least one water-soluble vitamin, and/or at least one trace mineral. Also at least one macro
mineral may be included.

Examples of fat-soluble vitamins are vitamin A, vitamin D3, vitamin E, and vitamin K,
e.g. vitamin K3.



Examples of water-soluble vitamins are vitamin B12, biotin and choline, vitamin B1,
vitamin B2, vitamin B6, niacin, folic acid and pantothenate, e.g. Ca-D-pantothenate.

Examples of trace minerals are manganese, zinc, iron, copper, iodine, selenium, and
cobalt.

Examples of macro minerals are calcium, phosphorus and sodium.

Further, optional, feed-additive ingredients are colouring agents, aroma compounds,
stabilizers, additional enzymes, and antimicrobial peptides.

Additional enzyme components of the composition of the invention include at least one
polypeptide having xylanase activity; and/or at least one polypeptide having endoglucanase
activity; and/or at least one polypeptide having endo-1,3(4)-beta-glucanase activity; and/or at
least one polypeptide having phytase activity; and/or at least one polypeptide having
galactanase activity; and/or at least one polypeptide having alpha-galactosidase activity.

Xylanase activity can be measured using any assay in which a substrate is employed
that includes 1,4-beta-D-xylosidic endo-linkages in xylans. Different types of substrates are
available for the determination of xylanase activity, e.g. Xylazyme cross-linked arabinoxylan
tablets (from MegaZyme), or insoluble powder dispersions and solutions of azo-dyed
arabinoxylan.

Endoglucanase activity can be determined using any endoglucanase assay known in
the art. For example, various cellulose- or beta-glucan-containing substrates can be applied.
An endoglucanase assay may use AZCL-Barley beta-Glucan, or preferably (1) AZCL-HE-
Cellulose, or (2) Azo-CM-cellulose as a substrate. In both cases, the degradation of the
substrate is followed spectrophotometrically at OD595 (see the Megazyme method for AZCL-
polysaccharides for the assay of endo-hydrolases at
http://www.megazyme.com/booklets/AZCLPOL.pdf).

Endo-1,3(4)-beta-glucanase activity can be determined using any
endo-1,3(4)-beta-glucanase assay known in the art. A preferred substrate for
endo-1,3(4)-beta-glucanase activity measurements is a cross-linked azo-coloured
beta-glucan Barley substrate, wherein
the measurements are based on spectrophotometric determination principles.

Phytase activity can be measured using any suitable assay, e.g. the FYT assay
described in Example 4 of WO 98/28408.

Galactanase can be assayed e.g. with AZCL galactan from Megazyme, and
alpha-galactosidase can be assayed e.g. with pNP-alpha-galactoside.

For assaying these enzyme activities the assay-pH and the assay-temperature are to
be adapted to the enzyme in question (preferably a pH close to the optimum pH, and a
temperature close to the optimum temperature). A preferred assay pH is in the range of 2-10,
preferably 3-9, more preferably pH 3 or 4 or 5 or 6 or 7 or 8, for example pH 3 or pH 7. A
preferred assay temperature is in the range of 20-90°C, preferably 30-90°C, more preferably
40-80°C, even more preferably 40-70°C, preferably 40 or 45 or 50°C. The enzyme activity is
defined by reference to appropriate blinds, e.g. a buffer blind.

Examples of antimicrobial peptides (AMP's) are CAP18, Leucocin A, Tritrpticin,
Protegrin-1, Thanatin, Defensin, Lactoferrin, Lactoferricin, and Ovispirin such as Novispirin
(Robert Lehrer, 2000), Plectasins, and Statins, including the compounds and polypeptides
disclosed in WO 03/044049 and PCT/DK02/00812, as well as variants or fragments of the
above that retain antimicrobial activity.

Examples of antifungal polypeptides (AFP's) are the Aspergillus giganteus and
Aspergillus niger peptides, as well as variants and fragments thereof which retain antifungal
activity, as disclosed in WO 94/01459 and WO 02/090384.

Usually fat and water soluble vitamins, as well as trace minerals, form part of a so-called
premix intended for addition to the feed, whereas macro minerals are usually separately added
to the feed. A premix enriched with a protease of the invention is an example of an animal
feed additive of the invention.

In a particular embodiment, the animal feed additive of the invention is intended for
being included (or prescribed as having to be included) in animal diets or feed at levels of 0.01
to 10.0%; more particularly 0.05 to 5.0%; or 0.2 to 1.0% (% meaning g additive per 100 g
feed). This is so in particular for premixes.
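To make the percentage definition concrete (g additive per 100 g feed), the following minimal Python sketch converts an inclusion level into a mass of additive for a given batch of feed. The function name `additive_mass_g` and the batch size in the usage note are illustrative assumptions; only the 0.01 to 10.0% window is taken from the text.

```python
def additive_mass_g(feed_mass_kg, inclusion_percent):
    """Mass of additive in grams for a batch of feed.

    inclusion_percent is g additive per 100 g feed, as defined in the text.
    Levels outside the broadest stated window (0.01 to 10.0%) are rejected.
    """
    if not 0.01 <= inclusion_percent <= 10.0:
        raise ValueError("inclusion level outside the 0.01-10.0% window")
    feed_mass_g = feed_mass_kg * 1000.0
    return feed_mass_g * inclusion_percent / 100.0
```

For example, a hypothetical 1000 kg batch of feed at an inclusion level of 0.2% would call for 2000 g (2 kg) of additive.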

The nutritional requirements of these components (exemplified with poultry and
piglets/pigs) are listed in Table A of WO 01/58275. Nutritional requirement means that these
components should be provided in the diet in the concentrations indicated.

In the alternative, the animal feed additive of the invention comprises at least one of
the individual components specified in Table A of WO 01/58275. At least one means either of,
one or more of, one, or two, or three, or four and so forth up to all thirteen, or up to all fifteen
individual components. More specifically, this at least one individual component is included in
the additive of the invention in such an amount as to provide an in-feed-concentration within
the range indicated in column four, or column five, or column six of Table A.

The present invention also relates to animal feed compositions. Animal feed
compositions or diets have a relatively high content of protein. Poultry and pig diets can be
characterised as indicated in Table B of WO 01/58275, columns 2-3. Fish diets can be
characterised as indicated in column 4 of this Table B. Furthermore such fish diets usually
have a crude fat content of 200-310 g/kg. WO 01/58275 corresponds to US 09/779334 which
is hereby incorporated by reference.

An animal feed composition according to the invention has a crude protein content of
50-800 g/kg, and furthermore comprises at least one protease variant as claimed herein.

Furthermore, or in the alternative (to the crude protein content indicated above), the
animal feed composition of the invention has a content of metabolisable energy of 10-30
MJ/kg; and/or a content of calcium of 0.1-200 g/kg; and/or a content of available phosphorus
of 0.1-200 g/kg; and/or a content of methionine of 0.1-100 g/kg; and/or a content of
methionine plus cysteine of 0.1-150 g/kg; and/or a content of lysine of 0.5-50 g/kg.

In particular embodiments, the content of metabolisable energy, crude protein, calcium,
phosphorus, methionine, methionine plus cysteine, and/or lysine is within any one of ranges 2,
3, 4 or 5 in Table B of WO 01/58275 (R. 2-5).

Crude protein is calculated as nitrogen (N) multiplied by a factor 6.25, i.e. Crude
protein (g/kg) = N (g/kg) x 6.25. The nitrogen content is determined by the Kjeldahl method
(A.O.A.C., 1984, Official Methods of Analysis 14th ed., Association of Official Analytical
Chemists, Washington DC).
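The crude protein formula above can be written as a short Python sketch. The function names are hypothetical; the factor 6.25 and the 50-800 g/kg crude protein window are taken from the text, and the nitrogen value in the usage note is an invented example rather than a measured result.

```python
NITROGEN_TO_PROTEIN_FACTOR = 6.25  # factor stated in the text


def crude_protein_g_per_kg(nitrogen_g_per_kg):
    """Crude protein (g/kg) = N (g/kg) x 6.25."""
    return nitrogen_g_per_kg * NITROGEN_TO_PROTEIN_FACTOR


def within_stated_protein_range(crude_protein):
    """True if the crude protein content lies in the 50-800 g/kg window."""
    return 50.0 <= crude_protein <= 800.0
```

A feed with an assumed Kjeldahl nitrogen content of 32 g/kg would thus have a crude protein content of 200 g/kg, which lies inside the stated 50-800 g/kg window.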
Metabolisable energy can be calculated on the basis of the NRC publication Nutrient
requirements in swine, ninth revised edition 1988, subcommittee on swine nutrition, committee
on animal nutrition, board of agriculture, national research council, National Academy Press,
Washington, D.C., pp. 2-6, and the European Table of Energy Values for Poultry Feed-stuffs,
Spelderholt centre for poultry research and extension, 7361 DA Beekbergen, The
Netherlands. Grafisch bedrijf Ponsen & looijen bv, Wageningen. ISBN 90-71463-12-5.

The dietary content of calcium, available phosphorus and amino acids in complete
animal diets is calculated on the basis of feed tables such as Veevoedertabel 1997, gegevens
over chemische samenstelling, verteerbaarheid en voederwaarde van voedermiddelen,
Central Veevoederbureau, Runderweg 6, 8219 pk Lelystad. ISBN 90-72839-13-7.

In a particular embodiment, the animal feed composition of the invention contains at
least one vegetable protein or protein source as defined above.

In still further particular embodiments, the animal feed composition of the invention
contains 0-80% maize; and/or 0-80% sorghum; and/or 0-70% wheat; and/or 0-70% Barley;
and/or 0-30% oats; and/or 0-40% soybean meal; and/or 0-10% fish meal; and/or 0-20% whey.

Animal diets can e.g. be manufactured as mash feed (non pelleted) or pelleted feed. Typically,
the milled feed-stuffs are mixed and sufficient amounts of essential vitamins and minerals are
added according to the specifications for the species in question. Enzymes can be added as
solid or liquid enzyme formulations. For example, a solid enzyme formulation is typically added
before or during the mixing step; and a liquid enzyme preparation is typically added after the
pelleting step. The enzyme may also be incorporated in a feed additive or premix.

The final enzyme concentration in the diet is within the range of 0.01-200 mg enzyme
protein per kg diet, for example in the range of 0.5-25 mg enzyme protein per kg animal diet.

The protease variant should of course be applied in an effective amount, i.e. in an
amount adequate for improving solubilization and/or improving nutritional value of feed. It is at
present contemplated that the enzyme is administered in one or more of the following amounts
(dosage ranges): 0.01-200; 0.01-100; 0.5-100; 1-50; 5-100; 10-100; 0.05-50; or 0.10-10 - all
these ranges being in mg protease enzyme protein per kg feed (ppm).

For determining mg enzyme protein per kg feed, the protease is purified from the feed
composition, and the specific activity of the purified protease is determined using a relevant
assay (see under protease activity, substrates, and assays). The protease activity of the feed
composition as such is also determined using the same assay, and on the basis of these two
determinations, the dosage in mg enzyme protein per kg feed is calculated.
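
The calculation from the two determinations can be sketched as follows. This Python example is an illustration under stated assumptions: the function names, the activity unit and the numbers are hypothetical, and both activities are assumed to be measured with the same assay so that the units cancel.

```python
def enzyme_protein_mg_per_kg(feed_activity_units_per_kg: float,
                             specific_activity_units_per_mg: float) -> float:
    """Dosage (mg enzyme protein per kg feed).

    feed_activity_units_per_kg: protease activity measured in the feed
        composition, in assay units per kg feed.
    specific_activity_units_per_mg: activity of the purified protease,
        in the same assay units per mg enzyme protein.
    """
    if specific_activity_units_per_mg <= 0:
        raise ValueError("specific activity must be positive")
    if feed_activity_units_per_kg < 0:
        raise ValueError("feed activity cannot be negative")
    return feed_activity_units_per_kg / specific_activity_units_per_mg


def within_contemplated_range(dose_mg_per_kg: float,
                              low: float = 0.01,
                              high: float = 200.0) -> bool:
    """Check a dosage against the broadest range given in the text."""
    return low <= dose_mg_per_kg <= high


if __name__ == "__main__":
    # Hypothetical measurements: 5000 units/kg feed, 250 units/mg protein.
    dose = enzyme_protein_mg_per_kg(5000.0, 250.0)
    print(dose, within_contemplated_range(dose))
```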

The same principles apply for determining mg enzyme protein in feed additives. Of
course, if a sample is available of the protease used for preparing the feed additive or the
feed, the specific activity is determined from this sample (no need to purify the protease from
the feed composition or the additive).

Detergent Compositions 

The protease variant of the invention may be added to and thus become a component
of a detergent composition.

The detergent composition of the invention may for example be formulated as a hand
or machine laundry detergent composition including a laundry additive composition suitable for
pre-treatment of stained fabrics and a rinse added fabric softener composition, or be
formulated as a detergent composition for use in general household hard surface cleaning
operations, or be formulated for hand or machine dishwashing operations.

In a specific aspect, the invention provides a detergent additive comprising the
protease variant of the invention. The detergent additive as well as the detergent composition
may comprise one or more other enzymes such as another protease, such as alkaline
proteases from Bacillus, a lipase, a cutinase, an amylase, a carbohydrase, a cellulase, a
pectinase, a mannanase, an arabinase, a galactanase, a xylanase, an oxidase, e.g., a
laccase, and/or a peroxidase.

In general the properties of the chosen enzyme(s) should be compatible with the
selected detergent (i.e. pH-optimum, compatibility with other enzymatic and non-enzymatic
ingredients, etc.), and the enzyme(s) should be present in effective amounts.

Suitable lipases include those of bacterial or fungal origin. Chemically modified or
protein engineered mutants are included. Examples of useful lipases include lipases from
Humicola (synonym Thermomyces), e.g. from H. lanuginosa (T. lanuginosus) as described in
EP 258068 and EP 305216 or from H. insolens as described in WO 96/13580, a
Pseudomonas lipase, e.g. from P. alcaligenes or P. pseudoalcaligenes (EP 218272), P.
cepacia (EP 331376), P. stutzeri (GB 1,372,034), P. fluorescens, Pseudomonas sp. strain SD
705 (WO 95/06720 and WO 96/27002), P. wisconsinensis (WO 96/12012), a Bacillus lipase,
e.g. from B. subtilis (Dartois et al. (1993), Biochemica et Biophysica Acta, 1131, 253-360), B.
stearothermophilus (JP 64/744992) or B. pumilus (WO 91/16422). Other examples are lipase
variants such as those described in WO 92/05249, WO 94/01541, EP 407225, EP 260105,
WO 95/35381, WO 96/00292, WO 95/30744, WO 94/25578, WO 95/14783, WO 95/22615,
WO 97/04079 and WO 97/07202. Preferred commercially available lipase enzymes include
LipolaseTM and Lipolase UltraTM (Novozymes A/S).

Suitable amylases (alpha- and/or beta-) include those of bacterial or fungal origin.
Chemically modified or protein engineered mutants are included. Amylases include, for
example, alpha-amylases obtained from Bacillus, e.g. a special strain of B. licheniformis,
described in more detail in GB 1,296,839. Examples of useful amylases are the variants
described in WO 94/02697, WO 94/18314, WO 96/23873, and WO 97/43424, especially the
variants with substitutions in one or more of the following positions: 15, 23, 105, 106, 124, 128,
133, 154, 156, 181, 188, 190, 197, 202, 208, 209, 243, 264, 304, 305, 391, 408, and 444.
Commercially available amylases are DuramylTM, TermamylTM, FungamylTM and BANTM
(Novozymes A/S), RapidaseTM and PurastarTM (from Genencor International Inc.).

Suitable cellulases include those of bacterial or fungal origin. Chemically modified or
protein engineered mutants are included. Suitable cellulases include cellulases from the
genera Bacillus, Pseudomonas, Humicola, Fusarium, Thielavia, Acremonium, e.g. the fungal
cellulases produced from Humicola insolens, Myceliophthora thermophila and Fusarium
oxysporum disclosed in US 4,435,307, US 5,648,263, US 5,691,178, US 5,776,757 and WO
89/09259. Especially suitable cellulases are the alkaline or neutral cellulases having colour
care benefits. Examples of such cellulases are cellulases described in EP 0 495257, EP
531372, WO 96/11262, WO 96/29397, WO 98/08940. Other examples are cellulase variants
such as those described in WO 94/07998, EP 0 531 315, US 5,457,046, US 5,686,593, US
5,763,254, WO 95/24471, WO 98/12307 and WO 99/01544. Commercially available
cellulases include CelluzymeTM, and CarezymeTM (Novozymes A/S), ClazinaseTM, and
Puradax HATM (Genencor International Inc.), and KAC-500(B)TM (Kao Corporation).
Suitable peroxidases/oxidases include those of plant, bacterial or fungal origin.
Chemically modified or protein engineered mutants are included. Examples of useful
peroxidases include peroxidases from Coprinus, e.g. from C. cinereus, and variants thereof as
those described in WO 93/24618, WO 95/10602, and WO 98/15257. Commercially available
peroxidases include GuardzymeTM (Novozymes).
The detergent enzyme(s) may be included in a detergent composition by adding
separate additives containing one or more enzymes, or by adding a combined additive
comprising all of these enzymes. A detergent additive of the invention, i.e. a separate additive
or a combined additive, can be formulated e.g. as a granulate, a liquid, a slurry, etc. Preferred
detergent additive formulations are granulates, in particular non-dusting granulates, liquids, in
particular stabilized liquids, or slurries.

Non-dusting granulates may be produced, e.g., as disclosed in US 4,106,991 and
4,661,452 and may optionally be coated by methods known in the art. Examples of waxy
coating materials are poly(ethylene oxide) products (polyethyleneglycol, PEG) with mean
molar weights of 1000 to 20000; ethoxylated nonylphenols having from 16 to 50 ethylene
oxide units; ethoxylated fatty alcohols in which the alcohol contains from 12 to 20 carbon
atoms and in which there are 15 to 80 ethylene oxide units; fatty alcohols; fatty acids; and
mono- and di- and triglycerides of fatty acids. Examples of film-forming coating materials
suitable for application by fluid bed techniques are given in GB 1483591. Liquid enzyme
preparations may, for instance, be stabilized by adding a polyol such as propylene glycol, a
sugar or sugar alcohol, lactic acid or boric acid according to established methods. Protected
enzymes may be prepared according to the method disclosed in EP 238,216.

The detergent composition of the invention may be in any convenient form, e.g., a bar,
a tablet, a powder, a granule, a paste or a liquid. A liquid detergent may be aqueous, typically
containing up to 70 % water and 0-30 % organic solvent, or non-aqueous.

The detergent composition comprises one or more surfactants, which may be non-ionic
including semi-polar and/or anionic and/or cationic and/or zwitterionic. The surfactants are
typically present at a level of from 0.1% to 60% by weight.

When included therein the detergent will usually contain from about 1% to about 40%
of an anionic surfactant such as linear alkylbenzenesulfonate, alpha-olefinsulfonate, alkyl
sulfate (fatty alcohol sulfate), alcohol ethoxysulfate, secondary alkanesulfonate, alpha-sulfo
fatty acid methyl ester, alkyl- or alkenylsuccinic acid or soap.

When included therein the detergent will usually contain from about 0.2% to about 40%
of a non-ionic surfactant such as alcohol ethoxylate, nonylphenol ethoxylate,
alkylpolyglycoside, alkyldimethylamineoxide, ethoxylated fatty acid monoethanolamide, fatty
acid monoethanolamide, polyhydroxy alkyl fatty acid amide, or N-acyl N-alkyl derivatives of
glucosamine ("glucamides").

The detergent may contain 0-65 % of a detergent builder or complexing agent such as
zeolite, diphosphate, triphosphate, phosphonate, carbonate, citrate, nitrilotriacetic acid,
ethylenediaminetetraacetic acid, diethylenetriaminepentaacetic acid, alkyl- or alkenylsuccinic
acid, soluble silicates or layered silicates (e.g. SKS-6 from Hoechst).

The detergent may comprise one or more polymers. Examples are
carboxymethylcellulose, poly(vinylpyrrolidone), poly(ethylene glycol), poly(vinyl alcohol),
poly(vinylpyridine-N-oxide), poly(vinylimidazole), polycarboxylates such as polyacrylates,
maleic/acrylic acid copolymers and lauryl methacrylate/acrylic acid copolymers.

The detergent may contain a bleaching system which may comprise a H2O2 source
such as perborate or percarbonate which may be combined with a peracid-forming bleach
activator such as tetraacetylethylenediamine or nonanoyloxybenzenesulfonate. Alternatively,
the bleaching system may comprise peroxyacids of e.g. the amide, imide, or sulfone type.

The enzyme(s) of the detergent composition of the invention may be stabilized using
conventional stabilizing agents, e.g., a polyol such as propylene glycol or glycerol, a sugar or
sugar alcohol, lactic acid, boric acid, or a boric acid derivative, e.g., an aromatic borate ester,
or a phenyl boronic acid derivative such as 4-formylphenyl boronic acid, and the composition
may be formulated as described in e.g. WO 92/19709 and WO 92/19708.

The detergent may also contain other conventional detergent ingredients such as e.g.
fabric conditioners including clays, foam boosters, suds suppressors, anti-corrosion agents,
soil-suspending agents, anti-soil redeposition agents, dyes, bactericides, optical brighteners,
hydrotropes, tarnish inhibitors, or perfumes.

It is at present contemplated that in the detergent compositions any enzyme, in
particular the enzyme of the invention, may be added in an amount corresponding to 0.01-100
mg of enzyme protein per liter of wash liquor, preferably 0.05-5 mg of enzyme protein per liter
of wash liquor, in particular 0.1-1 mg of enzyme protein per liter of wash liquor.

The enzyme of the invention may additionally be incorporated in the detergent
formulations disclosed in WO 97/07202.






Method for Generating Protease Variants 

The invention also relates to a method for generating a protease variant of an
improved property, the method comprising the following steps:

(a) selecting a parent protease of at least 60% identity to SEQ ID NO: 2;

(b) establishing a 3D structure of the parent protease by homology modelling using
the Fig. 2 structure as a model; and/or aligning the parent protease according to the alignment
of Fig. 1;

(c) proposing at least one amino acid substitution, e.g. by:

(i) subjecting the 3D structure of (b) to MD simulations at increased
temperatures, and identifying regions in the amino acid sequence of the parent
protease of high mobility (isotropic fluctuations);

(ii) introducing disulfide bridges by way of cysteine substitutions (C-C);

(iii) introducing proline substitutions (P);

(iv) replacing exposed neutral amino acid residues with negatively charged
amino acid residues (E,D);

(v) replacing exposed neutral amino acid residues with positively charged
amino acid residues (R,K);

(vi) replacing small amino acid residues inside the protein with bulkier amino
acid residues (W);

(vii) comparing by homology alignment and/or homology modelling
according to step (c)(i) at least two related parent proteases and transferring amino
acid residue differences in between these protease backbones, preferably from a
backbone having the improved property to a backbone not having this improved
property;

(d) preparing a DNA sequence encoding the parent protease but for inclusion of a
DNA codon of the at least one amino acid substitution proposed in steps (c)(ii)-(c)(vii), or
subjecting the parent DNA sequence to random mutagenesis, targeting at least one of the
regions identified in step (c)(i);

(e) expressing the DNA sequence obtained in step (d) in a host cell, and

(h) selecting a host cell expressing a protease variant with an improved property.

The invention furthermore relates to a method for producing a protease variant
obtainable or obtained by the method of generating protease variants described above,
comprising (a) cultivating the host cell to produce a supernatant comprising the variant; and
(b) recovering the variant.
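
Step (c)(i) of the generating method, identifying regions of high mobility, can be illustrated with a short Python sketch. It assumes that per-residue fluctuation values have already been extracted from an MD trajectory by some other tool; the function name, the threshold, the minimum region length and the example numbers are all hypothetical and are not taken from this disclosure.

```python
from typing import List, Sequence, Tuple


def high_mobility_regions(fluctuations: Sequence[float],
                          threshold: float,
                          min_length: int = 2) -> List[Tuple[int, int]]:
    """Return (start, end) residue ranges, 1-based and inclusive, in which
    every residue has a fluctuation value above the threshold."""
    regions: List[Tuple[int, int]] = []
    start = None
    for index, value in enumerate(fluctuations, start=1):
        if value > threshold:
            if start is None:
                start = index  # open a new candidate region
        elif start is not None:
            if index - start >= min_length:
                regions.append((start, index - 1))
            start = None  # close the candidate region
    # Handle a region that runs to the end of the sequence.
    if start is not None and len(fluctuations) - start + 1 >= min_length:
        regions.append((start, len(fluctuations)))
    return regions


if __name__ == "__main__":
    # Hypothetical fluctuation values for an 11-residue stretch.
    values = [0.5, 0.6, 1.4, 1.6, 1.5, 0.7, 0.5, 1.8, 0.6, 1.3, 1.2]
    print(high_mobility_regions(values, threshold=1.0))
```

The minimum length filter is a design choice of this sketch: isolated single residues above the threshold are ignored so that only contiguous stretches are reported as regions.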






Alternative Embodiment



In an alternative embodiment, the term "alteration" is used instead of "substitution" as
the general term for amendments in the protease molecule. This alternative embodiment
includes each of the claims formulated as exemplified below for claim 1, and also specifically
includes everything that is stated herein, e.g. definitions (other than the definition of
substitution), i.e. the various aspects, particular embodiments etc.
A variant of a parent protease, comprising an alteration in at least one position of at
least one region selected from the group of regions consisting of:
6-18; 22-28; 32-39; 42-58; 62-63; 66-76; 78-100; 103-106; 111-114; 118-131; 134-136; 139-
141; 144-151; 155-156; 160-176; 179-181; and 184-188; wherein

(a) the alteration(s) are independently

(i) an insertion of an amino acid immediately downstream of the position,

(ii) a deletion of the amino acid which occupies the position, and/or

(iii) a substitution of the amino acid which occupies the position;

(b) the variant has protease activity; and

(c) each position corresponds to a position of SEQ ID NO: 2; and

(d) the variant has a percentage of identity to SEQ ID NO: 2 of at least 60%.
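
Whether a given position falls within one of the listed regions can be checked mechanically. The following Python sketch is illustrative only; the function name is hypothetical, and the positions are assumed to be already numbered according to SEQ ID NO: 2 (the alignment step is not shown).

```python
# Regions listed in the text, as inclusive (first, last) position pairs
# in the numbering of SEQ ID NO: 2.
REGIONS = [
    (6, 18), (22, 28), (32, 39), (42, 58), (62, 63), (66, 76),
    (78, 100), (103, 106), (111, 114), (118, 131), (134, 136),
    (139, 141), (144, 151), (155, 156), (160, 176), (179, 181),
    (184, 188),
]


def in_listed_region(position: int) -> bool:
    """Return True if the position lies inside any listed region."""
    return any(first <= position <= last for first, last in REGIONS)


if __name__ == "__main__":
    for pos in (19, 82, 101, 188):
        print(pos, in_listed_region(pos))
```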

The term "polypeptide variant", "protein variant", "enzyme variant", "protease variant" or
simply "variant" refers to a polypeptide of the invention comprising one or more alteration(s),
such as substitution(s), insertion(s), deletion(s), and/or truncation(s) of one or more specific
amino acid residue(s) in one or more specific position(s) in the polypeptide.

The term "parent polypeptide", "parent protein", "parent enzyme", "standard enzyme",
"parent protease" or simply "parent" refers to the polypeptide on which the variant was based.
This term also refers to the polypeptide with which a variant is compared and aligned.

The term "randomized library", "variant library", or simply "library" refers to a library of
variant polypeptides. Diversity in the variant library can be generated via mutagenesis of the
genes encoding the variants at the DNA triplet level, such that individual codons are
variegated e.g. by using primers of partially randomized sequence in a PCR reaction. Several
techniques have been described, by which one can create a diverse combinatorial library by
variegating several nucleotide positions in a gene and recombining them, for instance where
these positions are too far apart to be covered by a single (spiked or doped) oligonucleotide
primer. These techniques include the use of in vivo recombination of the individually diversified
gene segments as described in WO 97/07205 on page 3, lines 8 to 29 (Novozymes A/S). They
also include the use of DNA shuffling techniques to create a library of full length genes,
wherein several gene segments are combined, and wherein each segment may be diversified
e.g. by spiked mutagenesis (Stemmer, Nature 370, pp. 389-391, 1994 and US 5,811,238; US
5,605,793; and US 5,830,721). One can use a gene encoding a protein "backbone" (wildtype
parent polypeptide) as a template polynucleotide, and combine this with one or more single or
double-stranded oligonucleotides as described in WO 98/41623 and in WO 98/41622
(Novozymes A/S). The single-stranded oligonucleotides could be partially randomized during
synthesis. The double-stranded oligonucleotides could be PCR products incorporating
diversity in a specific region. In both cases, one can dilute the diversity with corresponding
segments encoding the sequence of the backbone protein in order to limit the average number
of changes that are introduced.

# * 

Methods have also been established for designing the ratios of nucleotide mixtures (A;
C; T; G) to be inserted in specific codon positions during oligo- or polynucleotide synthesis, so
as to introduce a bias in order to approximate a desired frequency distribution towards a set of
one or more desired amino acids that will be encoded by the particular codons. It may be of
interest to produce a variant library that comprises permutations of a number of known amino
acid modifications in different locations in the primary sequence of the polypeptide. These
could be introduced post-translationally or by chemical modification sites, or they could be
introduced through mutations in the encoding genes. The modifications by themselves may
previously have been proven beneficial for one reason or another (e.g. decreasing antigenicity,
or improving specific activity, performance, stability, or other characteristics). In such
instances, it may be desirable first to create a library of diverse combinations of known
sequences. For example, if twelve individual mutations are known, one could combine (at
least) twelve segments of the parent protein encoding gene, wherein each segment is present
in two forms: one with, and one without the desired mutation. By varying the relative amounts
of those segments, one could design a library (of size 2^12) for which the average number of
mutations per gene can be predicted. This can be a useful way of combining mutations that
by themselves give some, but not sufficient effect, without resorting to very large libraries, as
is often the case when using 'spiked mutagenesis'. Another way to combine these 'known
mutations' could be by using family shuffling of oligomeric DNA encoding the known mutations
with fragments of the full length wild type sequence.
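
The twelve-mutation example can be made concrete with a short calculation. The Python sketch below is an illustration under an explicit assumption that is not stated in the text: each segment is incorporated independently of the others, with the mutated form present at a given fraction. The function names and the 25% fraction are hypothetical.

```python
from math import comb
from typing import Sequence


def library_size(n_sites: int) -> int:
    """Number of distinct combinations when each of n_sites segments
    is present in two forms (with or without the mutation)."""
    return 2 ** n_sites


def expected_mutations(mutant_fractions: Sequence[float]) -> float:
    """Average number of mutations per gene, assuming independent
    incorporation of each segment at the given mutant fractions."""
    if any(f < 0.0 or f > 1.0 for f in mutant_fractions):
        raise ValueError("fractions must lie between 0 and 1")
    return sum(mutant_fractions)


def prob_k_mutations(n_sites: int, fraction: float, k: int) -> float:
    """Probability of exactly k mutations when all sites share the same
    mutant fraction (binomial distribution, independence assumed)."""
    return comb(n_sites, k) * fraction ** k * (1.0 - fraction) ** (n_sites - k)


if __name__ == "__main__":
    print(library_size(12))
    # Hypothetical design: mutated form at 25% for every segment.
    print(expected_mutations([0.25] * 12))
    print(round(prob_k_mutations(12, 0.25, 3), 4))
```

Lowering the mutant fraction shifts the library towards genes carrying few mutations, which is the lever the text refers to when it mentions varying the relative amounts of the segments.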

In describing the various variants produced or contemplated according to the invention,
a number of nomenclatures and conventions are used which are described in detail below. A
frame of reference is first defined by aligning the variant polypeptide with a parent enzyme. A
preferred parent enzyme is Protease 10 (SEQ ID NO: 2). Thereby a number of alterations will
be defined in relation to the amino acid sequence of SEQ ID NO: 2.

A substitution in a variant is indicated as:

Original amino acid - position - substituted amino acid;

The three or one letter codes are used, including the codes Xaa and X to indicate any
amino acid residue. Accordingly, the notation "T82S" or "Thr82Ser" means that the variant
comprises a substitution of threonine with serine in the variant amino acid position
corresponding to the amino acid in position 82 in the parent enzyme, when the two are aligned
as indicated above.

Where the original amino acid residue may be any amino acid residue, a short hand
notation may at times be used indicating only the position, and the substituted amino acid, for
example:

Position - substituted amino acid; or "82S".

Such a notation is particularly relevant in connection with modification(s) in a series of
homologous polypeptides.

Similarly when the identity of the substituting amino acid residue is immaterial:
Original amino acid - position; or "T82".

When both the original amino acid(s) and substituted amino acid(s) may be any amino
acid, then only the position is indicated, e.g.: "82".

When the original amino acid(s) and/or substituted amino acid(s) may comprise more
than one, but not all amino acid(s), then the amino acids are listed separated by commas:
Original amino acids - position no. - substituted amino acids; or "T10E,D,Y".
A number of examples of this nomenclature are listed below:
The substitution of threonine for histidine in position 91 is designated as: "His91Thr" or
"H91T"; or the substitution of any amino acid residue for histidine in position 91 is
designated as: "His91Xaa" or "H91X" or "His91" or "H91".

For a modification where the original amino acid(s) and/or substituted amino acid(s)
may comprise more than one, but not all amino acid(s), the substitution of glutamic acid,
aspartic acid, or tyrosine for threonine in position 10:
"Thr10Glu,Asp,Tyr" or "T10E,D,Y"; which indicates the specific variants: "T10E",
"T10D", and "T10Y".
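
The comma-separated shorthand can be expanded into the specific variants it stands for. The following Python sketch handles only the one-letter forms shown above ("T82S", "82S", "T82", "82", "T10E,D,Y"); the function name and the regular expression are assumptions made for illustration and are not part of the nomenclature itself.

```python
import re
from typing import List

# Optional original residue, a position, and an optional comma-separated
# list of substituting residues, all in one-letter code.
_SHORTHAND = re.compile(r"^([A-Z]?)(\d+)([A-Z](?:,[A-Z])*)?$")


def expand_shorthand(notation: str) -> List[str]:
    """Expand e.g. "T10E,D,Y" into ["T10E", "T10D", "T10Y"]."""
    match = _SHORTHAND.match(notation.strip())
    if match is None:
        raise ValueError(f"unrecognised notation: {notation!r}")
    original, position, substitutes = match.groups()
    if substitutes is None:
        # Only the position (and possibly the original residue) is given.
        return [f"{original}{position}"]
    return [f"{original}{position}{residue}" for residue in substitutes.split(",")]


if __name__ == "__main__":
    for text in ("T10E,D,Y", "T82S", "82S", "T82", "82"):
        print(text, expand_shorthand(text))
```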

A deletion of glycine in position 26 will be indicated by: "Gly26*" or "G26*".
Correspondingly, the deletion of more than one amino acid residue, such as the
deletion of glycine and glutamine in positions 26 and 27 will be designated "Gly26*+Gln27*" or
"G26*+Q27*".

The insertion of an additional amino acid residue such as e.g. a lysine after G26 is
indicated by: "Gly26GlyLys" or "G26GK"; or, when more than one amino acid residue is
inserted, such as e.g. a Lys, and Ala after G26 this will be indicated as "Gly26GlyLysAla" or
"G26GKA".

In such cases the inserted amino acid residue(s) are numbered by the addition of lower
case letters to the position number of the amino acid residue preceding the inserted amino
acid residue(s). In the above example the sequences would thus be:

    Parent:    Variant:
    26         26  26a 26b
    G          G   K   A
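
The lower case letter numbering of inserted residues can be generated programmatically. This Python sketch is an illustration only; the function name is hypothetical, and it assumes at most 26 inserted residues after a single position.

```python
from string import ascii_lowercase
from typing import List


def insertion_labels(position: int, inserted_count: int) -> List[str]:
    """Labels for the residue at `position` followed by the residues
    inserted after it, e.g. 26, 26a, 26b for two inserted residues."""
    if not 0 <= inserted_count <= len(ascii_lowercase):
        raise ValueError("inserted_count must be between 0 and 26")
    labels = [str(position)]
    labels.extend(f"{position}{ascii_lowercase[i]}" for i in range(inserted_count))
    return labels


if __name__ == "__main__":
    # "G26GKA": two residues (K and A) inserted after G26.
    print(list(zip(insertion_labels(26, 2), "GKA")))
```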

In cases where an amino acid residue identical to the existing amino acid residue is
inserted, it is clear that degeneracy in the nomenclature arises. If for example a glycine is
inserted after the glycine in the above example this would be indicated by "G26GG".

Given that an alanine were present in position 25, the same actual change could just
as well be indicated as "A25AG":

                    Parent:    Variant:
    Numbering I:    25 26      25 26  26a
    Sequence:       A  G       A  G   G
    Numbering II:              25 25a 26

Such instances will be apparent to the skilled person, and the indication "G26GG" and
corresponding indications for this type of insertions is thus meant to comprise such equivalent
degenerate indications.

By analogy, if amino acid sequence segments are repeated in the parent polypeptide
and/or in the variant, it will be apparent to the skilled person that equivalent degenerate
indications are comprised, also when other alterations than insertions are listed such as
deletions and/or substitutions. For instance, the deletion of two consecutive amino acids "AG"
in the sequence "AGAG" from position 194-197, may be written as "A194*+G195*" or
"A196*+G197*":

                    Parent:            Variant:
    Numbering I:    194 195 196 197    194 195
    Sequence:       A   G   A   G      A   G
    Numbering II:                      196 197

Variants comprising multiple modifications are separated by pluses, e.g.:
"Arg170Tyr+Gly195Glu" or "R170Y+G195E", representing modifications in positions 170 and
195 substituting tyrosine and glutamic acid for arginine and glycine, respectively. Thus,
"Tyr167Gly,Ala,Ser,Thr+Arg170Gly,Ala,Ser,Thr" designates the following variants:
"Tyr167Gly+Arg170Gly", "Tyr167Gly+Arg170Ala", "Tyr167Gly+Arg170Ser",
"Tyr167Gly+Arg170Thr", "Tyr167Ala+Arg170Gly", "Tyr167Ala+Arg170Ala",
"Tyr167Ala+Arg170Ser", "Tyr167Ala+Arg170Thr", "Tyr167Ser+Arg170Gly",
"Tyr167Ser+Arg170Ala", "Tyr167Ser+Arg170Ser", "Tyr167Ser+Arg170Thr",
"Tyr167Thr+Arg170Gly", "Tyr167Thr+Arg170Ala", "Tyr167Thr+Arg170Ser", and
"Tyr167Thr+Arg170Thr".
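
The enumeration of combined variants follows from taking every combination of one substitution per position. A minimal Python sketch is given below; it assumes three-letter codes, and the function name and the regular expression are hypothetical choices made for this illustration.

```python
import re
from itertools import product
from typing import List

# Three-letter original residue, position, and a comma-separated list of
# three-letter substituting residues.
_SITE = re.compile(r"^([A-Za-z]{3})(\d+)([A-Za-z]{3}(?:,[A-Za-z]{3})*)$")


def expand_combinations(notation: str) -> List[str]:
    """Expand a plus-separated multi-site notation into all specific
    variants, one substitution being chosen per site."""
    per_site = []
    for site in notation.split("+"):
        match = _SITE.match(site.strip())
        if match is None:
            raise ValueError(f"unrecognised site notation: {site!r}")
        original, position, substitutes = match.groups()
        per_site.append(
            [f"{original}{position}{residue}" for residue in substitutes.split(",")]
        )
    # Cartesian product over the sites gives every combination.
    return ["+".join(choice) for choice in product(*per_site)]


if __name__ == "__main__":
    variants = expand_combinations("Tyr167Gly,Ala,Ser,Thr+Arg170Gly,Ala,Ser,Thr")
    print(len(variants))
    for variant in variants:
        print(variant)
```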






This nomenclature is particularly relevant relating to modifications aimed at substituting,
inserting or deleting amino acid residues having specific common properties; such
modifications are referred to as conservative amino acid modification(s).

The present invention is further described by the following examples which should not
be construed as limiting the scope of the invention.
Examples 

Example 1: Protease assays
pNA assay

pNA substrate: Suc-AAPF-pNA (Bachem L-1400).
Temperature: Room temperature (25°C)

Assay buffers: 100mM succinic acid, 100mM HEPES, 100mM CHES, 100mM CABS,
1mM CaCl2, 150mM KCl, 0.01% Triton X-100 adjusted to pH-values 2.0, 2.5, 3.0, 3.5, 4.0,
5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, and 12.0 with HCl or NaOH.

20µl protease (diluted in 0.01% Triton X-100) is mixed with 100µl assay buffer. The
assay is started by adding 100µl pNA substrate (50mg dissolved in 1.0ml DMSO and further
diluted 45x with 0.01% Triton X-100). The increase in OD405 is monitored as a measure of the
protease activity.
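
One way to turn the monitored increase in OD405 into a single number is to fit a straight line to the readings and report its slope. The Python sketch below is an assumption about data handling, not part of the assay as described: the function name, the use of an ordinary least squares fit and the example readings are all hypothetical.

```python
from typing import Sequence


def od_slope_per_min(times_min: Sequence[float],
                     od_values: Sequence[float]) -> float:
    """Least squares slope of OD405 against time (OD units per minute)."""
    n = len(times_min)
    if n != len(od_values) or n < 2:
        raise ValueError("need at least two paired readings")
    mean_t = sum(times_min) / n
    mean_od = sum(od_values) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (od - mean_od)
              for t, od in zip(times_min, od_values))
    return sxy / sxx


if __name__ == "__main__":
    # Hypothetical readings taken once per minute.
    times = [0.0, 1.0, 2.0, 3.0]
    readings = [0.10, 0.15, 0.20, 0.25]
    print(od_slope_per_min(times, readings))
```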

Protazyme AK assay

Substrate: Protazyme AK tablet (cross-linked and dyed casein; from Megazyme)
Temperature: controlled (assay temperature).

Assay buffers: 100mM succinic acid, 100mM HEPES, 100mM CHES, 100mM CABS,
1mM CaCl2, 150mM KCl, 0.01% Triton X-100 adjusted to pH-values 2.0, 2.5, 3.0, 3.5, 4.0,
5.0, 6.0, 7.0, 8.0, 9.0, 10.0 and 11.0 with HCl or NaOH.

A Protazyme AK tablet is suspended in 2.0ml 0.01% Triton X-100 by gentle stirring.
500µl of this suspension and 500µl assay buffer are mixed in an Eppendorf tube and placed
on ice. 20µl protease sample (diluted in 0.01% Triton X-100) is added. The assay is initiated
by transferring the Eppendorf tube to an Eppendorf thermomixer, which is set to the assay
temperature. The tube is incubated for 15 minutes on the Eppendorf thermomixer at its
highest shaking rate (1400 rpm). The incubation is stopped by transferring the tube back to
the ice bath. Then the tube is centrifuged in an icecold centrifuge for a few minutes and 200µl
supernatant is transferred to a microtiter plate. OD650 is read as a measure of protease
activity. A buffer blind is included in the assay (instead of enzyme).

The invention described and claimed herein is not to be limited in scope by the specific
embodiments herein disclosed, since these embodiments are intended as illustrations of
several aspects of the invention. Any equivalent embodiments are intended to be within the
scope of this invention. Indeed, various modifications of the invention in addition to those
shown and described herein will become apparent to those skilled in the art from the foregoing
description. Such modifications are also intended to fall within the scope of the appended
claims. In the case of conflict, the present disclosure including definitions will control.

Various references are cited herein, the disclosures of which are incorporated by
reference in their entireties.



Claims

1. A variant of a parent protease, comprising a substitution in at least one position of at
least one region selected from the group of regions consisting of:
6-18; 22-28; 32-39; 42-58; 62-63; 66-76; 78-100; 103-106; 111-114; 118-131; 134-136; 139-
141; 144-151; 155-156; 160-176; 179-181; and 184-188; wherein

(a) the variant has protease activity; and

(b) each position corresponds to a position of SEQ ID NO: 2; and

(c) the variant has a percentage of identity to SEQ ID NO: 2 of at least 60%.

2. The variant of claim 1 which comprises a substitution in at least one of the following
positions: 6; 7; 8; 9; 10; 11; 12; 13; 14; 15; 16; 17; 18; 22; 23; 24; 25; 26; 27; 28; 32; 33; 34;
35; 36; 37; 38; 39; 42; 43; 44; 45; 46; 47; 48; 49; 50; 51; 52; 53; 54; 55; 56; 57; 58; 62; 63; 66;
67; 68; 69; 70; 71; 72; 73; 74; 75; 76; 78; 79; 80; 81; 82; 83; 84; 85; 86; 87; 88; 89; 90; 91; 92;
93; 94; 95; 96; 97; 98; 99; 100; 103; 104; 105; 106; 111; 112; 113; 114; 118; 119; 120; 121;
122; 123; 124; 125; 126; 127; 128; 129; 130; 131; 134; 135; 136; 139; 140; 141; 144; 145;
146; 147; 148; 149; 150; 151; 155; 156; 160; 161; 162; 163; 164; 165; 166; 167; 168; 169;
170; 171; 172; 173; 174; 175; 176; 179; 180; 181; 184; 185; 186; 187; and/or 188.

3. The variant of claim 2 which comprises a substitution in at least one of the following
positions: 6; 7; 8; 9; 10; 12; 13; 16; 17; 18; 22; 23; 24; 25; 26; 27; 28; 32; 33; 37; 38; 39; 42;
43; 44; 45; 46; 47; 48; 49; 50; 51; 52; 53; 54; 55; 56; 58; 62; 63; 66; 67; 68; 69; 70; 71; 72; 73;
74; 75; 76; 78; 79; 80; 81; 82; 83; 84; 85; 86; 87; 88; 89; 90; 91; 92; 93; 94; 95; 96; 97; 98; 99;
100; 103; 105; 106; 111; 113; 114; 118; 120; 122; 124; 125; 127; 129; 130; 131; 134; 135;
136; 139; 140; 141; 144; 145; 146; 147; 148; 149; 150; 151; 155; 156; 160; 161; 162; 163;
164; 165; 166; 167; 168; 169; 170; 171; 172; 173; 174; 175; 176; 179; 180; 181; 184; 185;
186; 187; and/or 188.

4. The variant of claim 3, which comprises at least one of the following substitutions: 6G;
7P; 8C; 9C; 10E,D; 12E,D; 13E,D,P; 16C; 17C; 18C;
30 22A,C,D,E,F,G,H,I,K,LM,N,P,Q,R.S,V,W.Y; 23A,C,D.E,F,G > H,I,K i L,M,pJ.Q,R,S,T,V,W,Y; 

24C,D,E,F.G ) HJ,K,L,M,N,P,Q.RJMW;Y;2^ 

26A 1 C 1 D,E I F,H ) l,K,L 1 M,N,P,Q,R,S,T,V,W.Y; 27A,C,D 1 E.F.G,H,I,K,L,M,N,P,R,S,T.V,W,Y; 

28A 1 C,D,E,F,G,H,I,K,L,M,N,Q.R,S,T,V,W.Y; 32C; 33C; 37C; 39R.K; 42Elp; 

43A,C,D,E,F,G,HJ 1 K,L,M 1 NP,Q,R,SJ^ 
35 45A,C,D,E ( F,G,H,K^ > M,NP,Q,R 1 SJXW,Y;46A,C 1 D,E 1 F,HJ,K,L 1 M 1 N,P,"Q,R 1 S,T J V I W t Y; 

47A 1 C,D 1 E,F,G,HJ,K 1 L,M,P,Q,R.S.T.V,W,Y;48A,C,D,E 1 F,H I I,K 1 L,M ( N,P,Q,R,S 1 T,V,W,Y; 
49A,C 1 D,E 1 F i G i H,I,K,L,M > N,P,S 1 V,W > Y; 50A,C,D,E,F 1 H,I,K.L,M,N.P,Q,R,S,T,V,W I Y; 52C; 

55C; 56R,K; 58E,D; 62C; 63C; 66A,C,D,E,F,G,H,I,K,L,M,N,P,Q,S,T,V,W,Y;
67A 1 C f D,E.F,H,l,K 1 L,M,N,P,Q,R.S,T,V,VV,Y; 68A,C,D,E,F,G,H 1 I,K,L,M i nMQ.R.S.V,W,Y; 
69A,C,D,EP I G,HJ,K,L,M ( NPARJXWX70A,C 1 D,E,F,G,HJ,K,L,MPjQ,R,ST.V.W,Y; 

71A,C.D.E,G,H,I.K > L ( M,N,P,Q 1 R,S,T,V,W,Y; 72A,C.D,E I F,G,HJ,K,L,M,NfeQ,R,S,V,^.Y; 

5 73A > C,D.E,F,G,HJ,K,M,N I PAR,S/rX^^ 

75A,C 1 D,E,F,G I H.I,K,L,M,P,Q,R.S.T,V,^,Y;76b; 78A,C > D.E.F,G,H,l,k.q,Jw i N,P,Q,R,T.V,W,Y; 

79A ( C,D,E,F > G,HJ,KX,M.NP.QATXWy;8dA > C,D,E.F,G,HJ,K 1 L 1 M,N!|Q,R,S.T ( V,W; 

81AAD,E,F,G,HJ > K 1 L,MP,Q I R,ST,V,WX82A,C,D,E,FAHJ,K,L,M,N.PA.R.V,W,Y; 

83A 1 C,D 1 E,F.H,l,K 1 L 1 M,NPA.R,SJXW l Y;84A I C,D,E I F.HJ f K,L > M,N,P,Q.R ) SXV^^^ 
10 85A,C > D I E.F.G,HJ,K.UM,N,PAR.S,T,^,W;86C,D,E 1 F,G,H,I,K 1 L,M 1 N,P 1 R,S 1 T ) V,W 1 Y; 

87A,C,D 1 E,F 1 G,H 1 I 1 K > L,M,N,P,Q I R ? V 1 W,Y; 88A 1 C,D,E,F.G 1 H,l 1 K,L,M t N ) P,Q,R.S,T.W,Y; 

89C 1 D 1 E,F > G,H,I,K 1 L,M I N,P,Q,R,V,W,Y; 90A,C > D,E,F,H I I,K,L,M I N,P.Q,R,S 1 T 1 V,W,Y; 92P.R.K; 

93P; 94C,P; 95E,D; 96E,D,P; 97R,K; 98P; 99R,K; 103C; 105C,P; 106C; 111R,K; 113E,D;
118R,K; 120E,D; 122K; 124R,K; 125P; 127R,K; 129E,D; 130E,D; 134C; 135P; 136P; 139C;
140E,D; 141C; 144C; 145C; 146C; 147W; 148C; 149C; 150E,D; 151P,E,D; 155C; 156C;

160A,C,D,E,F,HJ,K^,M 1 N ) P,Q.R,ST/^^ 

162A,C,D,E I F,H,l,K,L,M,N,P.Q,R,S,T,v'.W,Y; 163A.C l D t E,F,G,HJ,K,L.M,F>.Q.R,S,T,V,W,Y; 
164A.D,E,F,G.H,I,K,L.M,N.P.Q.R.S.T.V;W,Y; 165A 1 C 1 D I E,F,G,H,I,K,L,M^,P,Q,T.V,W,Y; 
166A,C,D,E,G,HJ,KX.M f NP,Q,R,S,W,f,167A l C,D,E l F,HJ,K l L,M,N,P^jR.SJXW 
20 168A ( C,D,E,F,H,l,K,L,M l N,P,Q.R.S l T.v!w,Y; legA.CD.E.F.G.H.I.K.UM^.P.Q.R.S.V.W.Y; 
170A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y; 172C; 173C; 174P; 175P; 176P; 180R,K;
181R,K; 184P; 187P; and/or 188R,K.

5. The variant of any one of claims 1-4 which comprises at least one of the following pairs
of substitutions: 6C+103C; 8C+105C; 76C+85C; 94C+149C; 55C+63C; 16C+145C;
33C+144C; 62C+173C; 106C+141C; 9C+17C; 18C+156C; 32C+144C; 37C+52C; 67C+71C;
134C+170C; 139C+163C; 146C+148C; and/or 155C+172C.

6. The variant of claim 5, which comprises at least one of the following pairs of
substitutions: 6C+103C; 8C+105C; 76C+85C; 94C+149C; 55C+63C; 16C+145C; 33C+144C;
62C+173C; and/or 106C+141C.
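
Each pair in claims 5 and 6 names two positions that are both substituted with cysteine. The following is a minimal Python sketch of applying one such pair to a sequence. It assumes that positions are 1-based and that the input sequence is already numbered like SEQ ID NO: 2 with no gaps. The function names and the ten-residue toy sequence are illustrative only and do not come from the application.

```python
def parse_pair(pair: str) -> tuple[int, int]:
    """Parse a pair such as '6C+103C' into its two 1-based positions."""
    left, right = pair.split("+")
    if not (left.endswith("C") and right.endswith("C")):
        raise ValueError(f"expected two cysteine substitutions, got {pair!r}")
    return int(left[:-1]), int(right[:-1])


def apply_cysteine_pair(sequence: str, pair: str) -> str:
    """Return a copy of `sequence` with both positions of `pair` set to Cys."""
    residues = list(sequence)
    for position in parse_pair(pair):
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        residues[position - 1] = "C"  # 1-based position to 0-based index
    return "".join(residues)


# Toy example only: '6C+8C' is NOT one of the claimed pairs; it is used
# here because both positions fit inside a ten-residue fragment.
toy = "ADIIGGLAYT"
mutated = apply_cysteine_pair(toy, "6C+8C")
```

A claimed pair such as 6C+103C would be applied the same way to a full-length mature sequence.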

7. The variant of any one of claims 1-4 which comprises at least one of the following
substitutions: 81P; 82P; 151P; 176P; 24P; 25P; 92P; 93P; 94P; 96P; 98P; 105P; 136P; 184P;
187P; 174P; 7P; 13P; 23P; 27P; 125P; 135P; and/or 175P.

8. The variant of claim 7, which comprises at least one of the following substitutions: 81P;
82P; 151P; 176P; 24P; 25P; 92P; 93P; 94P; 96P; 98P; 105P; 136P; 184P; and/or 187P.

9. The variant of any one of claims 1-4 which comprises at least one of the following
substitutions: 95E,D; 42E,D; 84E,D; 96E,D; 47E,D; 46E,D; 150E,D; 70E,D; 13E,D; 140E,D;
10E,D; 151E,D; 129E,D; 130E,D; 166E,D; 161E,D; 120E,D; 82E,D; 58E,D; 12E,D; 81E,D;
69E,D; 113E,D; 89E,D; and/or 160E,D.

10. The variant of claim 9 which comprises at least one of the following substitutions:
95E,D; 42E,D; 84E,D; 96E,D; 47E,D; 46E,D; 150E,D; 70E,D; 13E,D; and/or 140E,D.

11. The variant of any one of claims 1-4 which comprises at least one of the following
substitutions: 124R,K; 72R,K; 97R,K; 127R,K; 56R,K; 122R,K; 181R,K; 180R,K; 25R,K;
92R,K; 39R,K; 99R,K; 111R,K; 24R,K; 118R,K; 162R,K; and/or 188R,K.

12. The variant of claim 11 which comprises at least one of the following substitutions:
124R,K; 72R,K; 97R,K; 127R,K; 56R,K; 122R,K; 181R,K; 180R,K; 25R,K; and/or 92R,K.

13. The variant of any one of claims 1-4 which comprises at least one of the following
substitutions: 147W; 43W.

14. The variant of any one of claims 1-4 which comprises a substitution in at least one
position of at least one region selected from the group of regions consisting of:

(i) 160-170, 78-90, 43-50, 66-75, and 22-28;

(ii) 160-170, 78-90, 43-50, and 66-75;

(iii) 160-170, 78-90, and 43-50;

(iv) 160-170, and 78-90; and/or

(v) 160-170.
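
Whether a given variant falls under one of the region groups of claim 14 can be checked mechanically. The sketch below is a hedged illustration, not part of the application: it assumes the substituted positions are already expressed in SEQ ID NO: 2 numbering, and the names `CLAIM_14_REGIONS` and `regions_hit` are hypothetical.

```python
# Regions of claim 14(i), as inclusive (first, last) position ranges.
CLAIM_14_REGIONS = [(160, 170), (78, 90), (43, 50), (66, 75), (22, 28)]


def regions_hit(positions, regions=CLAIM_14_REGIONS):
    """Return the regions that contain at least one substituted position."""
    return [
        (first, last)
        for first, last in regions
        if any(first <= p <= last for p in positions)
    ]


# Hypothetical variant with substitutions at positions 82 and 165.
hits = regions_hit([82, 165])
```

Groups (ii) to (v) are prefixes of the same list, so they can be tested by passing, for example, `CLAIM_14_REGIONS[:2]` for group (iv).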

15. The variant of any one of claims 1-4 which comprises at least one of the following
substitutions: 6C; 8C; 13E,D; 16C; 24P; 25K,P,R; 33C; 42E,D; 46D,E; 47D,E; 55C; 56R,K;
62C; 63C; 70D,E; 72K,R; 76C; 81P; 82P; 84D,E; 85C; 92P,R,K; 93P; 94C,P; 95E,D; 96E,D,P;
97R,K; 98P; 103C; 105C,P; 106C; 122R,K; 124R,K; 127R,K; 136P; 140E,D; 141C; 144C;
145C; 149C; 150E,D; 151P; 173C; 176P; 180R,K; 181R,K; 184P; and/or 187P.
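
The claims use a compact shorthand: a position followed by one or more alternative residues (written here with commas, as in 13E,D), optionally preceded by the parent residue (as in T10E,D,Y in the later claims). The following sketch parses that shorthand. It is only an illustration; the regular expression and the function names are assumptions of this example and are not defined anywhere in the application.

```python
import re

# Optional parent residue, a position, then comma-separated alternatives.
_SUBSTITUTION = re.compile(r"^([A-Z])?(\d+)([A-Z](?:,[A-Z])*)$")


def parse_substitution(token: str):
    """Parse '13E,D' or 'T10E,D,Y' into (parent, position, alternatives)."""
    match = _SUBSTITUTION.match(token.strip())
    if match is None:
        raise ValueError(f"unrecognised substitution: {token!r}")
    parent, position, alternatives = match.groups()
    return parent, int(position), alternatives.split(",")


def parse_substitution_list(text: str):
    """Split a ';'-separated list and parse each non-empty entry."""
    return [parse_substitution(t) for t in text.split(";") if t.strip()]


parsed = parse_substitution_list("6C; 8C; 25K,P,R")
```

When the parent residue is omitted, as in claims 4 to 15, the first element of each tuple is `None`.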


16. The variant of any one of claims 1-3, which comprises at least one of the following
substitutions: G6C; L7P; A8C; Y9C; T10E,D,Y; G12E,D; G13E,D,P; S16C; V17C; G18C;
T22A 1 C,D,E,F 1 G,H,I,K,L 1 M 1 N,P,Q,R,S,VW^ 

A24C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y (preferably A24S);
A25C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y (preferably A25S);
G26A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; Q27A,C,D,E,F,G,H,I,K,L,M,N,P,R,S,T,V,W,Y;
P28A,C,D,E,F,G,H,I,K,L,M,N,Q,R,S,T,V,W,Y; T32C; A33C; G37C; R38T; V39R,K;
Q42E,D,G,P; V43A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,W,Y;
T44A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y (preferably T44S);
I45A,C,D,E,F,G,H,K,L,M,N,P,Q,R,S,T,V,W,Y; G46A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
N47A,C,D,E,F,G,H,I,K,L,M,P,Q,R,S,T,V,W,Y; G48A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
R49A,C,D,E,F,G,H,I,K,L,M,N,P,Q,S,T,V,W,Y (preferably R49T,Q);
G50A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; V51T; F52C; E53Q; Q54N,R; S55C; V56R,K;
P58E,D; A62C,S; A63C; R66A,C,D,E,F,G,H,I,K,L,M,N,P,Q,S,T,V,W,Y;
G67A I C,D,E I F,HJ,K,L,M ( NPAR,SJ,V,W,Y;T68A,C,D 1 E 1 F,G,HJ,K,L,MiNP,^ 
S69A,C,D,E 1 F,G,HJ,K,L I M I N ) P,Q,RJXW,Y;N70A,C,D 1 E I F,G,H,I,K,L I M,P.Q ) R.S,T,V I W,Y; 

F71A,C,D,E 1 G 1 H,I,K,L,M,N,P,Q 1 R 1 S,T.V,W,Y; T72A,C,D,E,F,G,H,I,K 1 L,M,N,P,Q.R,S,V,W I Y; 
15 L73A,C,D,E,F,G,HJ I K t M,N 1 P 1 Q,R I SJXW,Y;T74A,C,D 1 E,F,G 1 HJ,K,L,M,N 1 P.Q,R.S 1 V.W,Y; 

N75A,C,D,E 1 F,G,H,I,K,L,M,P,Q > R ( S,T.V.W,Y;L76C; 

S78AAD,E,F,G,HJ,K,L I M I NPAR,T,V,W t Y;R79A,C,D I E ) F,G,H,l,K,L,M > N 1 P,Q > S,T.V.W > Y; 

Y80A,C,D,E > F,G,H 1 I,K,L I M,N,P,Q,R ( S 1 T,V,W; N81A.C,D,E,F,G 1 H,I,K,L,M,P 1 Q,R,S,T.V,W,Y; 

T82A I C i D,E I F,G,H 1 l 1 K,L,M,N,P ) Q,R,S,V 1 W,Y (preferably T82S); 
20 G83AAD,E,F > HJ,K,L > ^N I P I Q I R,STXW,Y;G84A,C,D,E,F I H,I,K,L,M 1 N,P,Q,R,S,T,V I W,Y; 

Y85A > C.D,E,F I G 1 HJ,K,L I M 1 NPA I R,S,ty i W;A86C,D,E,F,G l H,l,K,L,M,N,P,Q,R,S I T.V.W,Y 

(preferably A86Q); T87A,C,D > E 1 F,G,H,liK 1 L 1 M,N I P,Q,R,S,V,W,Y (preferably T87S); 
V88A,C.D > E,F,G,HJ.KX.M > NP,Q I R,S,T,W,Y;A89C,D,E,F,G,HJ,K.L,M,^P 1 Q,R.SJ,V.W ( ^ 

(preferably A89T,S); G90A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; H91T,S; N92P,R,K,S;
Q93P; A94C,P; P95A,E,D; I96A,E,D,P; G97R,K; S98P; S99A,Q,R,K; V100I; S103C; S105C,P;
T106C; C111R,K; T113E,D; I114V; G118N,R,K; S120T,E,D; S122R,K; P124R,K; E125P,Q;
T127R,K; T129E,D,Y,Q; N130E,D,S; M131L; T134C; T135P,N; V136P; E139C; P140E,D;
G141C; G144C; G145C; S146C; Y147F,W; I148C; S149C; G150E,D; T151P,E,D,S; G155C;
V156C; G160A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
S161A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,T,V,W,Y; G162A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
N163A,C,D,E,F,G,H,I,K,L,M,P,Q,R,S,T,V,W,Y; C164A,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
R165A,C,D,E,F,G,H,I,K,L,M,N,P,Q,S,T,V,W,Y (preferably R165S);
T166A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y (preferably T166V,F);
G167A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; G168A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
T169A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y; T170A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y;
F171Y; Y172C; Q173C; E174P; V175P; T176N,P; V179I,L; N180R,K,S; S181R,K; V184L,P;
R185T; L186I; R187P; and/or T188R,K.

17. The variant of any one of claims 1-3, which comprises at least one of the following
substitutions: G6C; L7P; A8C; Y9C; Y10E,D,T; G12E,D; G13E,D,P; S16C; V17C; G18C;
T22A 1 C,D ( E 1 F,G,HJ > K 1 L,M,N,P 1 Q,R,SXW,Y;N23A 1 C,D 1 E,F,G,H ( I,K < L,M,P,Q,R,S,T,V,W,Y; 

5 S24A,C,D,E I F t G,H ( l,K,L,M,N,P,Q,R,T,V,W,Y (preferably S24A); 
A25C I D,E,F,G 1 H,I,K,L,M,N,P > Q,R.S,T,V,W I Y (preferably A25S); 

G26AAD ( E,F,H^K,L,M,NP I Q,R,ST.y i W,Y;Q27A,C,D > E 1 F,G,H,l 1 K,L,M,N,P.R,S,T,V ( W I Y; 

P28A,C 1 D,E,F,G,H 1 I,K I L,M,N,Q,R,S,T,V,W,Y; T32C; A33C; G37C; T38R; V39R.K; 
G42E,D,P,Q; V43AAD,E,FAHJ,K,LMN,PAR,S.T.W,Y; 
10 T44A I C 1 D 1 E I F,G,H,I,K 1 L,M,N 1 P,Q,R.S,V,W,Y (preferably T44S); y, 

I45A,C,D > E 1 F,G I H ) K,L,M,N,P 1 Q 1 R,SJ 1 V,W,Y;G46A 1 C,D I EP,HJ,KX.M,N,PAR^ 
rW7A,C ) D,E 1 F,G,HJ,K,L,M,P,Q,R,ST,V,W,Y;G48A,C I D I E,F,HJ f K,L,M 1 ^ 1 P,Q,R,STM 
T49A,C,D ( E,F I G I H,I,K,L,M,N,P,Q,R,S,\/,W,Y (preferably T49R.Q); T51 
G50A > C,D,E 1 F,H 1 I.K,L,M,N,P,Q.R,S,T.V,W,Y; F52C; Q53E; N54Q.R; S55C; V56R.K; P58E.D; 

» • * • 

15 A62C.S; A63C; R66A,C,D,E,F,G,H,I,K,L,M,N,P [ Q,S.T,V,W,Y; rfl 

G67AAD,E > F,HJ,K,L,M,N.PAR.SJXW,Y;T68A,C,D 1 E,F,G,HJ,K,L.M;N,P,Q.R,S,V,vV,Y; 

S69A ) C,D,E,F 1 G,H,l,K,L,M,N,P 1 Q,R I T,y,W,Y; N70A,C.D,E,F,G,H,I,K 1 L,M.P.Q,R.S,T.V,W,Y; 
F71A,C,D l E l G,H,l,K,L I M,N,P,Q,R l S,T,y,W,Y; T72A,C 1 D,E,F,G,H,l,K 1 L I M > N ) P 1 Q,R,S i V,W 1 Y; 
L73AAD 1 E 1 F 1 G,HJ l K,M I N I P I Q,R,SXV,W 1 Y;T74A,C t D 1 E 1 F,G,H,l 1 K,L 1 M,N,P,Q,R,S 1 V 1 W,Y; 

20 N75A, C, D,E, F,G ,H,I,K, L,M, P, Q, R.S.T.V.W, Y; L76C; 

S78A ) C,D ) E 1 F,G,HAKX.M 1 N J P,Q,RJ i y,W,Y;R79A,C > D 1 E,F,G 1 H,l 1 K.L,M 1 N 1 P.Q,S,T,V ) W,Y; 

Y80A,C,D,E,F,G,H,I,K,L,M,N > P,Q,R,S 1 T,V ( W; N81A 1 C,D,E,F,G,H,I 1 K,L,M,P,Q,R,S,T,V,W,Y; 

S82A I C,D,E,F,G,H,I,K,L > M,N,P,Q,R,T,V,W,Y (preferably S82T); 

G83A 1 C,D,E 1 F,HJ,K^,M,N,PAR,S,T,y,W,Y;G84A,C l D,E,F,HJ,K,L,M,N,P,Q.R,STXW,Y^ 
25 Y85AAD,E,F > G,HJ,K > L,M ) N,P 1 Q,R,S,TXW^ 

(preferably Q86A); S87A,C,D,E,F 1 G,H,|Jk,L.M,N 1 P,Q,R,T,V,W,Y (preferably S87T); , 

V88AAD,E,FAHJ,K,L,M,NPAR.S.T,W,Y;t89AAD,E,F,G,HJ,KX,^;N,P,Q,R,^ 

(preferably T89A,S); G90A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; T91H,S; S92P,R,K,N;
Q93P; A94C,P; P95A,E,D; A96E,D,I,P; G97R,K; S98P; A99R,K,S; V100I; S103C; S105C,P;
T106C; C111R,K; T113E,D; I114V; N118G,R,K; T120E,D,S; R122K,S; P124R,K; Q125E,P;
T127R,K; Y129E,D,T; S130E,D,N; L131M; T134C; N135P,T; V136P; E139C; P140E,D;
G141C; G144C; G145C; S146C; F147W,Y; I148C; S149C; G150E,D; S151P,E,D,T; G155C;
V156C; G160A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
S161A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,T,V,W,Y; G162A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
N163A,C,D,E,F,G,H,I,K,L,M,P,Q,R,S,T,V,W,Y; C164A,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
S165A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,T,V,W,Y (preferably S165R);
V166A,C,D,E,G,H,I,K,L,M,N,P,Q,R,S,T,W,Y (preferably V166F,T);
G167A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; G168A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
T169A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y; T170A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y;
Y171F; Y172C; Q173C; E174P; V175P; T176P,N; I179L,V; N180R,K,S; S181R,K; V184L,P;
R185T; I186L; R187P; and/or T188R,K.

18. The variant of any one of claims 1-3, which comprises at least one of the following
substitutions: G6C; L7P; A8C; Y9C; T10E,D,Y; G12E,D; G13E,D,P; S16C; V17C; G18C;
T22A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y; N23A,C,D,E,F,G,H,I,K,L,M,P,Q,R,S,T,V,W,Y;
A24C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y (preferably A24S);
A25C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y (preferably A25S);
G26A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; Q27A,C,D,E,F,G,H,I,K,L,M,N,P,R,S,T,V,W,Y;
P28A,C,D,E,F,G,H,I,K,L,M,N,Q,R,S,T,V,W,Y; T32C; A33C; G37C; R38T; V39R,K;
Q42E,D,G,P; V43A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,W,Y;
S44A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,T,V,W,Y (preferably S44T);
15 I45A,C,D,E.F,G,H 1 K^,M,NP,Q,R,SJ,V,W,Y;G46A,C,D,E,F.HJ,K.L,M,^;P,Q,R.ST,V.W.^ 

N47A,C I D > E,F,G,HJ I K,L l M 1 P,Q,R,SJ,V,WXG48A 1 C.D ( E,F,HJ,K,L.M,N,f.Q.R.STy,W,^ 

Q49A 1 C ) D,E,F,G,H,l l K,L,M 1 N 1 P I R,S,T,v',W,Y (preferably Q49R.T); j£ 
G50A 1 C,D,E > F I H,l,K,L,M,N,P,Q,R,S,T ) y,W,Y; V51T; F52C; E53Q; Q54N-R; S55C; I56R,K; 
P58E.D; A62C.S; A63G; R66A,C,D 1 E,FiG,H I l,K,L,M,N,P,Q,S,T,V,W 1 Y; ; j 
20 G67AAD,E,F,HJ ) K,L,M I Np i Q,R,S,T I V,W t Y;T68A,C,D,E,F,G,HJ 1 KX 1 M;N,P,Q,R,SXW,Y; 
S69A,C > D > E,F,G 1 HJ,K,L,M 1 N 1 P,Q l RJ,V,W,Y;N70A,C,D,E,F 1 G 1 H,l I K,L,M,P,Q,R ) S 1 T,V,vV,Y; 
F71A,C 1 D,EAHJ 1 K^,M,N,P 1 Q,R,S 1 T,V,W,Y;T72A,C,D,E,F,G,H,I,K,L,M,N,P 1 Q,R,S.V.W,Y; 
L73A,C.D,E.F,G 1 HJ > K,M,NP,Q,R 1 SJ,V,W,Y;T74A,C,D,E,F,G,H,I,K,L,M 1 N,P,Q.R,S,V.W,Y; 

N75A,C 1 D,E,F 1 G,H 1 I,K 1 L,M,P,Q,R,S.T,V,W,Y; L76C; 
25 S78A,C,D,E 1 F,G.HJ,K,L.M,N > P 1 Q,R.T,V.W,Y;R79A,C,D,E,F,G,H I I I K ) L,M,N 1 P,Q.S,T,V,W,Y; 

Y80A,C,D,E I F I G,HJ > KX,M,N,PAR,S,fxW;N81A.C.D,E,F,G.HJ,K,L,M,P,Q.R,S,T.V.W,Y; 

T82A,C 1 D,E,F,G,H,l,K.L.M,N,P,Q.R.S,y.W.Y (preferably T82S); 

G83A 1 C,D,E,F > HJ,K^,M,NP > Q,R.SJ,V,W,Y;.G84A,C.D,E,F 1 HJ > K,L 1 M,N;P,Q,R.S 

Y85A ( C,D,E,F,G,HJ I K,L,M,N,P.Q.R,S,TXW;A86C,D,E.F,G,HJ > K^,M,N,P 1 Q,R,SJ I V,W 

30 (preferably A86Q); T87A I C,D,E,F,G,H,liK,L,M,N,P,Q ( R,S,V,W,Y (preferably T87S); 

V88A,C.D.E 1 F,G.HJ,KX.M,N,P,Q.R.S,W 

(preferably A89T,S); G90A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; H91T,S; N92P,R,K,S;
Q93P; A94C,P; P95A,E,D; I96A,E,D,P; G97R,K; S98P; S99A,Q,R,K; V100I; S103C; S105C,P;
T106C; C111R,K; T113E,D; I114V; G118N,R,K; S120E,D,T; S122R,K; P124R,K; E125P,Q;
T127R,K; T129E,D,Q,Y; N130E,D,S; M131L; T134C; T135N,P; V136P; E139C; P140E,D;
G141C; G144C; G145C; S146C; Y147F,W; I148C; S149C; G150E,D; N151P,E,D,T; G155C;
V156C; G160A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;


S161AAD I E 1 F,G,HJ,KX,M,N,P 1 Q.RXV^^^ 
N163A,CAE.F,G l HJ.K l L,M,PAR.Sj;V,W l Y;-C^^ 
R165A,C,D,E,F,G,H,I,K,L,M,N,P,Q,S,T,V,W,Y (preferably R165S);
T166A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y (preferably T166F,V);
G167A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; G168A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
T169A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y; T170A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y;
F171Y; Y172C; Q173C; E174P; V175P; T176N,P; V179I,L; N180R,K,S; S181R,K; V184L,P;
R185T; L186I; R187P; and/or T188R,K.

19. The variant of any one of claims 1-3, which comprises at least one of the following
substitutions: G6C; L7P; A8C; Y9C; T10E,D,Y; G12E,D; G13E,D,P; S16C; V17C; G18C;
T22A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y; N23A,C,D,E,F,G,H,I,K,L,M,P,Q,R,S,T,V,W,Y;
A24C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y (preferably A24S);
A25C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y (preferably A25S);
G26A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; Q27A,C,D,E,F,G,H,I,K,L,M,N,P,R,S,T,V,W,Y;
P28A,C,D,E,F,G,H,I,K,L,M,N,Q,R,S,T,V,W,Y; T32C; A33C; G37C; R38T; V39R,K;
Q42E,D,G,P; V43A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,W,Y;
T44A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y (preferably T44S);
I45A,C,D,E,F,G,H,K,L,M,N,P,Q,R,S,T,V,W,Y; G46A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;

20 N47A,C.D,E.F I G,HJ > K,L,MP,Q 1 R l SJ,yW^ 

R49A,C,D,E,F,G,H,I,K,L,M,N,P,Q,S,T,V,W,Y (preferably R49Q,T);
G50A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; V51T; F52C; E53Q; Q54N,R; S55C; I56R,K;
P58E,D; A62C,S; A63C; R66A,C,D,E,F,G,H,I,K,L,M,N,P,Q,S,T,V,W,Y;
G67A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; T68A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y;
S69A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,T,V,W,Y; N70A,C,D,E,F,G,H,I,K,L,M,P,Q,R,S,T,V,W,Y;
F71A,C,D,E,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; T72A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y;
L73A,C,D,E,F,G,H,I,K,M,N,P,Q,R,S,T,V,W,Y; T74A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y;
N75A,C,D,E,F,G,H,I,K,L,M,P,Q,R,S,T,V,W,Y; L76C;
S78A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,T,V,W,Y; R79A,C,D,E,F,G,H,I,K,L,M,N,P,Q,S,T,V,W,Y;

30 Y80A,C.D,E 1 F,G.HJ,K,L,M,N,P,Q 1 R.S,TXW 

T82A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y (preferably T82S);
G83A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; G84A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
Y85A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W; A86C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y
(preferably A86Q); T87A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y (preferably T87S);
V88A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,W,Y; A89C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y
(preferably A89S,T); G90A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; H91T,S; N92P,R,K,S;
Q93P; A94C,P; P95A,E,D; I96A,E,D,P; G97R,K; S98P; S99A,Q,R,K; V100I; S103C; S105C,P;
T106C; C111R,K; T113E,D; I114V; G118N,R,K; S120E,D,T; S122R,K; P124R,K; E125P,Q;
T127R,K; T129E,D,Q,Y; N130E,D,S; M131L; T134C; T135N,P; V136P; E139C; P140E,D;
G141C; G144C; G145C; S146C; Y147F,W; I148C; S149C; G150E,D; N151P,E,D,S; G155C;
V156C; G160A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;

5 S161A,C,D > E,F,G,HJ,K,L,M,NP I Q,Rj;m 

N163A,C,D,E,F,G,H,I,K,L,M,P,Q,R,S,T,V,W,Y; C164A,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
R165A,C,D,E,F,G,H,I,K,L,M,N,P,Q,S,T,V,W,Y (preferably R165S);
T166A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y (preferably T166V);
G167A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; G168A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
T169A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y; T170A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y;
F171Y; Y172C; Q173C; E174P; V175P; T176N,P; V179I,L; N180R,K,S; S181R,K; V184L,P;
R185T; L186I; R187P; and/or T188R,K.

20. The variant of any one of claims 1-3, which comprises at least one of the following 

substitutions: G6C; L7P; A8C; Y9C; T10E,D,Y; G12E,D; G13E,D,P; S16C; V17C; G18C;
T22A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y; N23A,C,D,E,F,G,H,I,K,L,M,P,Q,R,S,T,V,W,Y;
A24C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y (preferably A24S);
S25A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,T,V,W,Y (preferably S25A);

G26A I C,D 1 E,F,HJ I K,L,M,NP I Q,RAT.V,W,Y;Q27A,C 1 D 1 E,F,G i H,l 1 K,L ) ^,N,P I R 1 STX^ 

20 P28A > C,D,E ) F,G,H,l,K,L,M,N 1 Q,R,S,T.\] r > W,Y; T32C; A33C; G37C; T38Rj-'V39R,K; 
P42E,D,G,Q; V43A,C 1 D,E ( F,G I H,I,K,L,^.N,P 1 Q,R,S,T.W 1 Y; ; jj ' 

S44A 1 C,D,E,F,G,H,I.K,L.M,N,P.Q,R 1 T,V,W,Y (preferably S44T); $ 
l45A,C,D,E,F i G.H,K,L,M,N,P l Q,R,S J.V.W.Y; G^ 

N47A,C,D,E,F,G,H ) l,K I L ) M,P 1 Q,R,SJ,y,W,Y;G48AAD 1 E,F,HJ,K I L,M 1 N,f>,Q,R,S 
25 Q49A 1 C,D ) E,F,G,H 1 I,K,L,M,N,P,R,S,T.V 1 W,Y (preferably Q49R.T); 

G50A 1 C,D 1 E,F.H,I,K,L,M 1 N,P 1 Q,R,S,T,V I W,Y; V51T; F52C; E53Q; R54N.Q; S55C; V56R.K; 
P58E.D; S62A.C; A63C; ReeA.CD.E.F.G.HJ.K.L.M.N.P.Q.S.T.V.W.Y; 
G67A 1 C,D > E,F,HJ,K,L 1 M,N > P 1 Q,R,S,T,V,W,Y;T68A,C,D 1 E I F,G ) H 1 I 1 K,L 1 M,N 1 P 1 Q,R,S,V,W,Y; 

S69A,C > D,E,F I G,H I I,K,L,M,N,P,Q,RJ.V.W,Y; N70A,C,D,E,F ) G,H,I,K,L,M,P,Q,R,S,T,V,W 1 Y; 
30 F71A,C,D,E,G,HJ,K.L,M,NP.Q,R,SJ,y.W,Y;T72A.C.D t E l F,G l HJ.K I L.M f N,P,Q,R,S,V,W,Y; 
L73A,C,D,E,F,G,H,l,K,M,NP,Q 1 RAT,y.W,Y;T74A > C t D ) E,F.G,HJ,K.L,M,Np.Q,R,S,V 

N75A 1 C.D,E > F.G,H,l,K,L,M,P,Q p R,S,T.y.W,Y;i:76C; J|_j 
S78A,C,D > E,F,G 1 HJ.K,L,M ( N,P,Q,RJ,yXY;R79A,C,D > E,F,G,HJ,KX.HN 1 P.Q,ST^ 
Y80A,C.D,E,F ( G,H,I I K ( L,M.N,P,Q,R.S,T,V 1 W; N81A ) C I D,E,F,G,HJ,K|,l^;P l Q,R > S > T.y,W,Y; 
35 T82A 1 C 1 D 1 E,F,G,H,l,K 1 L.M 1 N,P > Q,R I S,y,W,Y (preferably T82S); : jf . 

G83A < C I D,E,F,H,l,K,L,M,N,P,Q,R,S,T,y,W,Y; G84A,C,D,E l F,H,l,K,L,M,l|(P I Q,R,S,T.V,W,Y; 
Y85A,C,D,E 1 F > G,H,l 1 K 1 L,M,N 1 PAR,S I T ; ,V I W;A86C I D,E,F 1 G,HJ,K,L 1 M l N ; P,R,Q,^ 

i 43 5.! 
(preferably A86Q); T87A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y (preferably T87S);
V88A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,T,W,Y; S89A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,T,V,W,Y
(preferably S89A,T); G90A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; S91H,T; S92P,R,K,N;
Q93P; A94C,P; P95A,E,D; I96A,E,D,P; G97R,K; S98P; Q99A,R,K,S; I100V; S103C; S105C,P;
T106C; C111R,K; T113E,D; V114I; G118N,R,K; T120E,D,S; S122R,K; P124R,K; Q125E,P;
T127R,K; Q129E,D,Y,T; N130E,D,S; L131M; T134C; N135P,T; V136P; E139C; P140E,D;
G141C; G144C; G145C; S146C; F147W,Y; I148C; S149C; G150E,D; S151P,E,D,T; G155C;
V156C; G160A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
S161A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,T,V,W,Y; G162A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;

10 N163A,C,D 1 E.F,G,H,I,K,L,M,P,Q.R 1 S,T^ 

S165A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,T,V,W,Y (preferably S165R);
F166A,C,D,E,G,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y (preferably F166T,V);
G167A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y; G168A,C,D,E,F,H,I,K,L,M,N,P,Q,R,S,T,V,W,Y;
T169A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y; T170A,C,D,E,F,G,H,I,K,L,M,N,P,Q,R,S,V,W,Y;
Y171F; Y172C; Q173C; E174P; V175P; T176N,P; L179I,V; S180R,K,N; S181R,K; L184P,V;
T185R; L186I; R187P; and/or T188R,K.

21. The variant of any one of claims 1-16 and 18-20 which comprises at least one of the
following substitutions: T10Y, A24S, V51T, E53Q, T82S, A86Q, T87S, I96A, G118N, S122R,
N130S, L186I.

22. The variant of any one of claims 1-16 and 18-19 which comprises at least one of the
following substitutions: R38T; Q42G,P; R49T,Q; Q54N,R; A89S,T; H91S,T; N92S; S99A,Q;
A120T; E125Q; T129Y,Q; M131L; T135N; Y147F; N151S; R165S; T166V,F; F171Y; V179I,L;
preferably at least one of the following substitutions: R38T; N92S; A120T; E125Q; M131L;
T135N; Y147F; N151S; R165S; and/or F171Y.

23. The variant of any one of claims 1-19 which comprises at least one of the following
substitutions: A25S, T44S, A62S, P95A, V100I, I114V, T176N, N180S, V184L, R185T.

24. The variant of any one of claims 1-23 which is not identical to any one of the following
amino acid sequences: SEQ ID NO: 2, amino acids 1-188 of SEQ ID NO: 4, amino acids 1-188
of SEQ ID NO: 6, amino acids 1-188 of SEQ ID NO: 8, and amino acids 1-188 of SEQ ID
NO: 10.

25. The variant of any one of claims 1-24 which is not identical to the protease derived
from Nocardiopsis dassonvillei NRRL 18133, provided the latter has at least 60% identity to
SEQ ID NO: 2.

26. The variant of any one of claims 1-25 which is not identical to the protease derived
from Nocardiopsis sp. FERM P-10508, and/or Nocardiopsis dassonvillei strain ZIMET 43647
provided the respective protease has at least 60% identity to SEQ ID NO: 2.

27. The variant of any one of claims 1-26 which is not identical to any protease that may 
belong to the prior art and which has at least 60% identity to SEQ ID NO: 2. 
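
Several of the claims above turn on a 60% identity threshold relative to SEQ ID NO: 2. The sketch below shows one simple way to compute a percentage of identity from two sequences that have already been aligned. It is an assumption-laden illustration: the alignment program is not specified here, gaps are written as "-", and the denominator (columns where neither sequence has a gap) is a choice made for this example rather than the definition used in the application.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of identical residues over gap-free aligned columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 60.0) -> bool:
    """True if the identity is at least `threshold` percent."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy fragments only; real use would pass full-length aligned sequences.
identity = percent_identity("ADIIGG", "ADIVGG")
```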

28. The variant of any one of claims 1-27 which has amended properties, such as an
improved thermostability.

29. The variant of claim 28 which has a Tm of at least 83.1 °C as measured by DSC in
10 mM sodium phosphate, 50 mM sodium chloride, pH 7.0.

30. The variant of any one of claims 1-29 which derives from a strain of the genus
Nocardiopsis.

31. The variant of claim 30 which derives from a strain of Nocardiopsis alba, Nocardiopsis
antarctica, Nocardiopsis prasina, Nocardiopsis composta, Nocardiopsis dassonvillei,
Nocardiopsis exhalans, Nocardiopsis halophila, Nocardiopsis halotolerans, Nocardiopsis
kunsanensis, Nocardiopsis listeri, Nocardiopsis lucentensis, Nocardiopsis metallicus,
Nocardiopsis sp., Nocardiopsis synnemataformans, Nocardiopsis trehalosi, Nocardiopsis
tropica, Nocardiopsis umidischolae, or Nocardiopsis xinjiangensis.

32. The variant of claim 31 which derives from Nocardiopsis alba DSM 15647,
Nocardiopsis dassonvillei NRRL 18133, Nocardiopsis dassonvillei subsp. dassonvillei DSM
43235, Nocardiopsis prasina DSM 15648, Nocardiopsis prasina DSM 15649, Nocardiopsis sp.
NRRL 18262, Nocardiopsis dassonvillei strain ZIMET 43647, or Nocardiopsis sp. FERM
P-10508.

33. A method for generating a protease variant of an improved property, the method
comprising the following steps:

(a) selecting a parent protease of at least 60% identity to SEQ ID NO: 2;

(b) establishing a 3D structure of the parent protease by homology modelling using
the Fig. 2 structure as a model; and/or aligning the parent protease according to the alignment
of Fig. 1;

(c) proposing at least one amino acid substitution, e.g. by:

(i) subjecting the 3D structure of (b) to MD simulations at increased
temperatures, and identifying regions in the amino acid sequence of the parent
protease of high mobility (isotropic fluctuations);

(ii) introducing disulfide bridges by way of cysteine substitutions (C-C);

(iii) introducing proline substitutions (P);

(iv) replacing exposed neutral amino acid residues with negatively charged
amino acid residues (E,D);

(v) replacing exposed neutral amino acid residues with positively charged
amino acid residues (R,K);

(vi) replacing small amino acid residues inside the protein with bulkier amino
acid residues (W);

(vii) comparing by homology alignment and/or homology modelling
according to step (c)(i) at least two related parent proteases and transferring amino
acid residue differences between these protease backbones, preferably from a
backbone having the improved property to a backbone not having this improved
property;

(d) preparing a DNA sequence encoding the parent protease but for inclusion of a
DNA codon of the at least one amino acid substitution proposed in steps (c)(ii)-(c)(vii), or
subjecting the parent DNA sequence to random mutagenesis, targeting at least one of the
regions identified in step (c)(i);

(e) expressing the DNA sequence obtained in step (d) in a host cell, and

(h) selecting a host cell expressing a protease variant with an improved property.
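
Step (c)(vii) of claim 33, transferring residue differences from one backbone to another, can be sketched as a comparison of two aligned sequences. The example below is a hedged illustration under stated assumptions: both sequences are already aligned with gaps written as "-", position numbering follows the ungapped target sequence (which is assumed to share the numbering of SEQ ID NO: 2), and the function name `transfer_candidates` is hypothetical.

```python
def transfer_candidates(target: str, donor: str) -> list[str]:
    """List substitutions (e.g. 'T10Y') that change `target` residues
    into the `donor` residues found at the same aligned positions."""
    if len(target) != len(donor):
        raise ValueError("aligned sequences must have equal length")
    candidates = []
    position = 0
    for t, d in zip(target, donor):
        if t == "-":
            continue  # insertion in the donor: no target position to mutate
        position += 1  # numbering follows the ungapped target sequence
        if d != "-" and t != d:
            candidates.append(f"{t}{position}{d}")
    return candidates


# Toy fragments: the target ends in Thr10, the donor in Tyr10.
proposals = transfer_candidates("ADIIGGLAYT", "ADIIGGLAYY")
```

Each proposed substitution would then go through steps (d) to (h), since the comparison alone does not show which differences carry the improved property.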

34. A method of preparing a protease variant, the method comprising the steps of

(a) cultivating the host cell of claim 33(h) to produce a supernatant comprising the variant; and

(b) recovering the variant.

35. An isolated nucleic acid sequence comprising a nucleic acid sequence which encodes
the protease variant of any of claims 1-32, or the protease variant obtainable according to
claim 34.

36. The nucleic acid sequence of claim 35 which is not identical to any one of the following
nucleic acid sequences: nucleotides 900-1466 of SEQ ID NO: 1, nucleotides 499-1062 of
SEQ ID NO: 3, nucleotides 496-1059 of SEQ ID NO: 5, nucleotides 496-1059 of SEQ ID NO:
7, and nucleotides 502-1065 of SEQ ID NO: 9.
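
As a quick consistency check on the nucleotide ranges of claim 36, the sketch below converts each inclusive range into a codon count. It does arithmetic only and assumes the ranges are inclusive and in frame; the dictionary and function names are made up for this example.

```python
# Inclusive nucleotide ranges as listed in claim 36.
CLAIM_36_RANGES = {
    "SEQ ID NO: 1": (900, 1466),
    "SEQ ID NO: 3": (499, 1062),
    "SEQ ID NO: 5": (496, 1059),
    "SEQ ID NO: 7": (496, 1059),
    "SEQ ID NO: 9": (502, 1065),
}


def codon_count(first: int, last: int) -> int:
    """Number of whole codons in the inclusive range first..last."""
    length = last - first + 1
    if length % 3 != 0:
        raise ValueError(f"range {first}-{last} is not a multiple of three")
    return length // 3


codons = {name: codon_count(*span) for name, span in CLAIM_36_RANGES.items()}
```

Under these assumptions the ranges for SEQ ID NOs: 3, 5, 7 and 9 each span 188 codons, matching the 188 amino acids recited for the corresponding protein sequences, while the range for SEQ ID NO: 1 spans 189 codons.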

37. The nucleic acid sequence of claim 35 which is not identical to the nucleic acid
sequence encoding the mature peptide part of the protease derived from Nocardiopsis
dassonvillei NRRL 18133, provided this protease has at least 60% identity to SEQ ID NO: 2.

38. The nucleic acid sequence of claim 35 which is not identical to the nucleic acid
sequence encoding the mature peptide part of the proteases derived from Nocardiopsis sp.
FERM P-10508, and/or Nocardiopsis dassonvillei strain ZIMET 43647, provided the respective
protease has at least 60% identity to SEQ ID NO: 2.

39. A nucleic acid construct comprising the nucleic acid sequence of any one of claims 35-38
operably linked to one or more control sequences that direct the production of the protease
variant in a suitable expression host.

40. A recombinant expression vector comprising the nucleic acid construct of claim 39.

41. A recombinant host cell comprising the nucleic acid construct of claim 39 and/or the
expression vector of claim 40.

42. A method for producing the protease variant of any one of claims 1-32, or the variant
obtainable according to claim 34, the method comprising:

(a) cultivating the host cell of claim 41 to produce a supernatant comprising the variant; and

(b) recovering the variant.

43. A transgenic plant, or plant part, capable of expressing a protease variant of any one of
claims 1-32, and/or a protease variant obtainable according to claim 34.

44. A transgenic, non-human animal, or products, or elements thereof, being capable of
expressing a protease variant of any one of claims 1-32, and/or a protease variant obtainable
according to claim 34.

45. A composition comprising at least one protease variant of any one of claims 1-32,
and/or a protease variant obtainable according to claim 34, and

(a) at least one fat soluble vitamin;

(b) at least one water soluble vitamin; and/or

(c) at least one trace mineral.

46. The composition of claim 45 further comprising at least one enzyme selected from the
following group of enzymes: galactanases, alpha-galactosidases, xylanases, endoglucanases,
endo-1,3(4)-beta-glucanases, and phytases.

47. The composition of any one of claims 45-46 which is an animal feed additive.

48. An animal feed composition having a crude protein content of 50 to 800 g/kg and
comprising the protease variant of any one of claims 1-32, and/or the protease variant
obtainable according to claim 34, and/or the composition of any one of claims 45-47.

49. A method for improving the nutritional value of an animal feed, wherein the protease
variant of any one of claims 1-32, and/or the protease variant obtainable according to claim
34, and/or the composition of any one of claims 45-48 is added to the feed.

50. A method for the treatment of vegetable proteins, comprising the step of adding the
protease variant of any one of claims 1-32, and/or the protease variant obtainable according to
claim 34, and/or the composition of any one of claims 45-48 to at least one vegetable protein
or protein source.

51. Use of the protease variant of any one of claims 1-32, and/or the protease variant
obtainable according to claim 34, and/or the composition of any one of claims 45-48 (i) in
animal feed; (ii) in the preparation of animal feed; (iii) for improving the nutritional value of
animal feed; and/or (iv) for the treatment of vegetable proteins.

52. Use of the protease variant of any one of claims 1-32, and/or the protease variant
obtainable according to claim 34 in detergents.

Abstract

The invention relates to variants of a parent protease homologous to Nocardiopsis
proteases, in particular variants of improved thermostability. The invention also relates to DNA
sequences encoding such variants, their production in a recombinant host cell, as well as
methods of using the variants, in particular within the field of animal feed and detergents. The
invention furthermore relates to methods of generating and preparing protease variants of
amended properties.

10508DK.ST25
SEQUENCE LISTING

<110> Novozymes A/S
<120> Protease variants
<130> 10508
<160> 10

<170> PatentIn version 3.2

<210> 1
<211> 1596
<212> DNA
<213> Nocardiopsis sp. NRRL 18262 (Protease 10)

<220>
<221> CDS
<222> (900)..(1466)

<400> 1
acgtttggta cgggtaccgg tgtccgcatg tggccagaat gcccccttgc gacagggaac 60

ggattcggtc ggtagcgcat cgactccgac aaccgcgagg tggccgttcg cgtcgccacg 120 

ttctgcgacc gtcatgcgac ccatcatcgg gtgaccccac cgagctctga atggtccacc 180 

gttctgacgg tctttccctc accaaaacgt gcacctatgg ttaggacgtt gtttaccgaa 240 

tgtctcggtg aacgacaggg gccggacggt attcggcccc gatcccccgt tgatcccccc 300 

aggagagtag ggaccccatg cgaccctccc ccgttgtctc cgccatcggt acgggagcgc 360 

tggccttcgg tctggcgctg tccggtaccc cgggtgccct cgcggccacc ggagcgctcc 420 

cccagtcacc caccccggag gccgacgcgg tctccatgca ggaggcgctc cagcgcgacc 480 

tcgacctgac ctccgccgag gccgaggagc tgctggccgc ccaggacacc gccttcg^gg 540 

tcgacgaggc cgcggccgag gccgccgggg acgcctacgg cggctccgtc ttcgacaccg 

agagcctgga actgaccgtc ctggtcaccg atgccgctgc ggtcgaggcc gtggaggcca 660 

ccggcgccgg gaccgagctg gtctcctacg gcatcgacgg tctcgacgag atcgtccagg 720 

agctcaacgc cgccgacgcc gttcccggtg tggtcggctg gtacccggac gtggcgggtg 780 

acaccgtcgt cctggaggtc ctggagggtt ccggagccga cgtcagcggc ctgctcgcgg 840 

acgccggcgt ggacgcctcg gccgtcgagg tgaccacgag cgaccagccc gagctctae 899 

gcc gac ate ate ggt ggt ctg gcc tac acc atg ggc ggc cgc tgt teg 947 
Ala Asp lie lie Gly Gly Leu Ala Tyr Thr Met Gly Gly Arg cys Ser 
15 10 15 

gtc ggc ttc gcg gcc acc aac gcc gcc ggt cag ccc ggg ttc gtc acc 
val Gly Phe Ala Ala Thr Asn Ala Ala Gly Gin Pro Gly Phe Val Thr 

20 25 30 

gcc ggt cac tgc ggc cgc gtg ggc acc; cag gtg acc ate ggc aac ggc 1043 
Ala Gly His cys Gly Arg val Gly Thr Gin val Thr lie Gly Asn Gly 



35 T 40 45 

agg ggc gtc ttc gag cag tec gtc ttc ccc ggc aac gac gcg gcc ttc 

Arg Gly val Phe Glu Gin ser val Phe Pro Gly Asn Asp Ala Ala Phe 
50 55 60 

. Page 1 . , . 
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600 



995 



1091 



Hi 



1Q508DK.ST25 

gtc cgc ggt acg tcc aac ttc acg ctg acc aac ctg gtc agc cgc tac   1139
Val Arg Gly Thr Ser Asn Phe Thr Leu Thr Asn Leu Val Ser Arg Tyr
65                  70                  75                  80

aac acc ggc ggg tac gcc acg gtc gcc ggt cac aac cag gcc ccc atc   1187
Asn Thr Gly Gly Tyr Ala Thr Val Ala Gly His Asn Gln Ala Pro Ile
                85                  90                  95

ggc tcc tcc gtc tgc cgc tcc ggc tcc acc acc ggt tgg cac tgc ggc   1235
Gly Ser Ser Val Cys Arg Ser Gly Ser Thr Thr Gly Trp His Cys Gly
            100                 105                 110

acc atc cag gcc cgc ggc cag tcg gtg agc tac ccc gag ggc acc gtc   1283
Thr Ile Gln Ala Arg Gly Gln Ser Val Ser Tyr Pro Glu Gly Thr Val
        115                 120                 125

acc aac atg acc cgg acc acc gtg tgc gcc gag ccc ggc gac tcc ggc   1331
Thr Asn Met Thr Arg Thr Thr Val Cys Ala Glu Pro Gly Asp Ser Gly
    130                 135                 140

ggc tcc tac atc tcc ggc acc cag gcc cag ggc gtg acc tcc ggc ggc   1379
Gly Ser Tyr Ile Ser Gly Thr Gln Ala Gln Gly Val Thr Ser Gly Gly
145                 150                 155                 160

tcc ggc aac tgc cgc acc ggc ggg acc acc ttc tac cag gag gtc acc   1427
Ser Gly Asn Cys Arg Thr Gly Gly Thr Thr Phe Tyr Gln Glu Val Thr
                165                 170                 175

ccc atg gtg aac tcc tgg ggc gtc cgt ctc cgg acc tga tccccgcggt   1476
Pro Met Val Asn Ser Trp Gly Val Arg Leu Arg Thr
            180                 185

tccaggcgga ccgacggtcg tgacctgagt accaggcgtc cccgccgctt ccagcggcgt   1536
ccgcaccggg gtgggaccgg gcgtggccac ggccccaccc gtgaccggac cgcccggcta   1596

<210> 2
<211> 188
<212> PRT
<213> Nocardiopsis sp. NRRL 18262 ("Protease 10")

<400> 2

Ala Asp Ile Ile Gly Gly Leu Ala Tyr Thr Met Gly Gly Arg Cys Ser
1               5                   10                  15

Val Gly Phe Ala Ala Thr Asn Ala Ala Gly Gln Pro Gly Phe Val Thr
            20                  25                  30

Ala Gly His Cys Gly Arg Val Gly Thr Gln Val Thr Ile Gly Asn Gly
        35                  40                  45

Arg Gly Val Phe Glu Gln Ser Val Phe Pro Gly Asn Asp Ala Ala Phe
    50                  55                  60

Val Arg Gly Thr Ser Asn Phe Thr Leu Thr Asn Leu Val Ser Arg Tyr
65                  70                  75                  80

Asn Thr Gly Gly Tyr Ala Thr Val Ala Gly His Asn Gln Ala Pro Ile
                85                  90                  95

Gly Ser Ser Val Cys Arg Ser Gly Ser Thr Thr Gly Trp His Cys Gly
            100                 105                 110

Thr Ile Gln Ala Arg Gly Gln Ser Val Ser Tyr Pro Glu Gly Thr Val
        115                 120                 125

Thr Asn Met Thr Arg Thr Thr Val Cys Ala Glu Pro Gly Asp Ser Gly
    130                 135                 140

Gly Ser Tyr Ile Ser Gly Thr Gln Ala Gln Gly Val Thr Ser Gly Gly
145                 150                 155                 160

Ser Gly Asn Cys Arg Thr Gly Gly Thr Thr Phe Tyr Gln Glu Val Thr
                165                 170                 175

Pro Met Val Asn Ser Trp Gly Val Arg Leu Arg Thr
            180                 185

<210> 3
<211> 1065
<212> DNA
<213> Nocardiopsis dassonvillei subspecies dassonvillei DSM 43235 ("Protease 18")

<220>
<221> CDS
<222> (1)..(1062)

<220>
<221> mat_peptide
<222> (499)..(1062)



<400> 3

gct ccg gcc ccc gtc ccc cag acc ccc gtc gcc gac gac agc gcc   45
Ala Pro Ala Pro Val Pro Gln Thr Pro Val Ala Asp Asp Ser Ala
    -165                -160                -155

gcc agc atg acc gag gcg ctc aag cgc gac ctc gac ctc acc tcg   90
Ala Ser Met Thr Glu Ala Leu Lys Arg Asp Leu Asp Leu Thr Ser
    -150                -145                -140

gcc gag gcc gag gag ctt ctc tcg gcg cag gaa gcc gcc atc gag   135
Ala Glu Ala Glu Glu Leu Leu Ser Ala Gln Glu Ala Ala Ile Glu
    -135                -130                -125

acc gac gcc gag gcc acc gag gcc gcg ggc gag gcc tac ggc ggc   180
Thr Asp Ala Glu Ala Thr Glu Ala Ala Gly Glu Ala Tyr Gly Gly
    -120                -115                -110

tca ctg ttc gac acc gag acc ctc gaa ctc acc gtg ctg gtc acc gac   228
Ser Leu Phe Asp Thr Glu Thr Leu Glu Leu Thr Val Leu Val Thr Asp
    -105                -100                -95

gcc tcc gcc gtc gag gcg gtc gag gcc acc gga gcc cag gcc acc gtc   276
Ala Ser Ala Val Glu Ala Val Glu Ala Thr Gly Ala Gln Ala Thr Val
-90                 -85                 -80                 -75

gtc tcc cac ggc acc gag ggc ctg acc gag gtc gtg gag gac ctc aac   324
Val Ser His Gly Thr Glu Gly Leu Thr Glu Val Val Glu Asp Leu Asn
                -70                 -65                 -60

ggc gcc gag gtt ccc gag agc gtc ctc ggc tgg tac ccg gac gtg gag   372
Gly Ala Glu Val Pro Glu Ser Val Leu Gly Trp Tyr Pro Asp Val Glu
            -55                 -50                 -45

agc gac acc gtc gtg gtc gag gtg ctg gag ggc tcc gac gcc gac gtc   420
Ser Asp Thr Val Val Val Glu Val Leu Glu Gly Ser Asp Ala Asp Val
        -40                 -35                 -30

gcc gcc ctg ctc gcc gac gcc ggt gtg gac tcc tcc tcg gtc cgg gtd   468
Ala Ala Leu Leu Ala Asp Ala Gly Val Asp Ser Ser Ser Val Arg Val
    -25                 -20                 -15

gag gag gcc gag gag gcc ccg cag gtc tac gcc gac atc atc ggc ggc   516
Glu Glu Ala Glu Glu Ala Pro Gln Val Tyr Ala Asp Ile Ile Gly Gly
-10                 -5              -1  1               5

ctg gcc tac tac atg ggc ggc cgc tgc tcc gtc ggc ttc gcc gcg acc   564
Leu Ala Tyr Tyr Met Gly Gly Arg Cys Ser Val Gly Phe Ala Ala Thr
            10                  15                  20

aac agc gcc ... cag ccc ggt ttc gtc acc gcc ggc cac tgc ggc acc   612
Asn Ser Ala Gly Gln Pro Gly Phe Val Thr Ala Gly His Cys Gly Thr
        25                  30                  35

gtc ggc acc ggc gtg acc atc ggc aac ggc acc ggc acc ttc cag aac   660
Val Gly Thr Gly Val Thr Ile Gly Asn Gly Thr Gly Thr Phe Gln Asn
    40                  45                  50

tcg gtc ttc ccc ggc aac gac gcc gcc ttc gtc cgc ggc acc tcc aaS   708
Ser Val Phe Pro Gly Asn Asp Ala Ala Phe Val Arg Gly Thr Ser Asn
55                  60                  65                  70

ttc acc ctg acc aac ctg gtc tcg cgc tac aac tcc ggc ggc tac cag   756
Phe Thr Leu Thr Asn Leu Val Ser Arg Tyr Asn Ser Gly Gly Tyr Gln
                75                  80                  85

tcg gtg acc ggt acc agc cag gcc ccg gcc ggc tcg gcc gtg tgc cg*T   804
Ser Val Thr Gly Thr Ser Gln Ala Pro Ala Gly Ser Ala Val Cys Arg
            90                  95                  100

tcc ggc tcc acc acc ggc tgg cac tgc ggc acc atc cag gcc cgc aac   852
Ser Gly Ser Thr Thr Gly Trp His Cys Gly Thr Ile Gln Ala Arg Asn
        105                 110                 115

cag acc gtg cgc tac ccg cag ggc acc gtc tac tcg ctc acc cgc acc   900
Gln Thr Val Arg Tyr Pro Gln Gly Thr Val Tyr Ser Leu Thr Arg Thr
    120                 125                 130

aac ... tgc gcc gag ccc ggc gac tcc ggc ggt tcg ttc atc tcc ggc   948
Asn Val Cys Ala Glu Pro Gly Asp Ser Gly Gly Ser Phe Ile Ser Gly
135                 140                 145                 150

tcg ... gcc cag ggc gtc acc tcc ggc ggc tcc ggc aac tgc tcc gtc   996
Ser Gln Ala Gln Gly Val Thr Ser Gly Gly Ser Gly Asn Cys Ser Val
                155                 160                 165

ggc ggc acg acc tac tac cag gag gtc acc ccg atg atc aac tcc tgg   1044
Gly Gly Thr Thr Tyr Tyr Gln Glu Val Thr Pro Met Ile Asn Ser Trp
            170                 175                 180

ggt gtc agg atc cgg acc taa   1065
Gly Val Arg Ile Arg Thr
        185

<210> 4
<211> 354
<212> PRT
<213> Nocardiopsis dassonvillei subspecies dassonvillei DSM 43235 ("Protease 18")

<400> 4

Ala Pro Ala Pro Val Pro Gln Thr Pro Val Ala Asp Asp Ser Ala
    -165                -160                -155

Ala Ser Met Thr Glu Ala Leu Lys Arg Asp Leu Asp Leu Thr Ser
    -150                -145                -140

Ala Glu Ala Glu Glu Leu Leu Ser Ala Gln Glu Ala Ala Ile Glu
    -135                -130                -125

Thr Asp Ala Glu Ala Thr Glu Ala Ala Gly Glu Ala Tyr Gly Gly
    -120                -115                -110

Ser Leu Phe Asp Thr Glu Thr Leu Glu Leu Thr Val Leu Val Thr Asp
    -105                -100                -95

Ala Ser Ala Val Glu Ala Val Glu Ala Thr Gly Ala Gln Ala Thr Val
-90                 -85                 -80                 -75

Val Ser His Gly Thr Glu Gly Leu Thr Glu Val Val Glu Asp Leu Asn
                -70                 -65                 -60

Gly Ala Glu Val Pro Glu Ser Val Leu Gly Trp Tyr Pro Asp Val Glu
            -55                 -50                 -45

Ser Asp Thr Val Val Val Glu Val Leu Glu Gly Ser Asp Ala Asp Val
        -40                 -35                 -30

Ala Ala Leu Leu Ala Asp Ala Gly Val Asp Ser Ser Ser Val Arg Val
    -25                 -20                 -15

Glu Glu Ala Glu Glu Ala Pro Gln Val Tyr Ala Asp Ile Ile Gly Gly
-10                 -5              -1  1               5

Leu Ala Tyr Tyr Met Gly Gly Arg Cys Ser Val Gly Phe Ala Ala Thr
            10                  15                  20

Asn Ser Ala Gly Gln Pro Gly Phe Val Thr Ala Gly His Cys Gly Thr
        25                  30                  35

Val Gly Thr Gly Val Thr Ile Gly Asn Gly Thr Gly Thr Phe Gln Asn
    40                  45                  50

Ser Val Phe Pro Gly Asn Asp Ala Ala Phe Val Arg Gly Thr Ser Asn
55                  60                  65                  70

Phe Thr Leu Thr Asn Leu Val Ser Arg Tyr Asn Ser Gly Gly Tyr Gln
                75                  80                  85

Ser Val Thr Gly Thr Ser Gln Ala Pro Ala Gly Ser Ala Val Cys Arg
            90                  95                  100

Ser Gly Ser Thr Thr Gly Trp His Cys Gly Thr Ile Gln Ala Arg Asn
        105                 110                 115

Gln Thr Val Arg Tyr Pro Gln Gly Thr Val Tyr Ser Leu Thr Arg Thr
    120                 125                 130

Asn Val Cys Ala Glu Pro Gly Asp Ser Gly Gly Ser Phe Ile Ser Gly
135                 140                 145                 150

Ser Gln Ala Gln Gly Val Thr Ser Gly Gly Ser Gly Asn Cys Ser Val
                155                 160                 165

Gly Gly Thr Thr Tyr Tyr Gln Glu Val Thr Pro Met Ile Asn Ser Trp
            170                 175                 180

Gly Val Arg Ile Arg Thr
        185

<210> 5
<211> 1062
<212> DNA
<213> Nocardiopsis prasina DSM 15648 ("Protease 11")

<220>
<221> CDS
<222> (1)..(1059)

<220>
<221> mat_peptide
<222> (496)..(1059)

<400> 5

gcc acc gga ccg ctc ccc cag tca ccc acc ccg gag gcc gac gcc   45
Ala Thr Gly Pro Leu Pro Gln Ser Pro Thr Pro Glu Ala Asp Ala
-165                -160                -155

gtc tcc atg cag gag gcg ctc cag cgc gac ctc ggc ctg acc ccg   90
Val Ser Met Gln Glu Ala Leu Gln Arg Asp Leu Gly Leu Thr Pro
-150                -145                -140

ctt gag gcc gat gaa ctg ctg gcc gcc cag gac acc gcc ttc gag   135
Leu Glu Ala Asp Glu Leu Leu Ala Ala Gln Asp Thr Ala Phe Glu
-135                -130                -125

gtc gac gag gcc gcg gcc gcg gcc gcc ggg gac gcc tac ggc ggc   180
Val Asp Glu Ala Ala Ala Ala Ala Ala Gly Asp Ala Tyr Gly Gly
-120                -115                -110

tcc gtc ttc gac acc gag acc ctg gaa ctg acc gtc ctg gtc acc gac   228
Ser Val Phe Asp Thr Glu Thr Leu Glu Leu Thr Val Leu Val Thr Asp
-105                -100                -95                 -90

gcc gcc tcg gtc gag gct gtg gag gcc acc ggc gcg ggt acc gaa ctc   276
Ala Ala Ser Val Glu Ala Val Glu Ala Thr Gly Ala Gly Thr Glu Leu
                -85                 -80                 -75
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gtc 

val 


tec 
Ser 


tac 
Tyr 


ggc 
Gly 
-70 


ate 
lie 


g?g 

Glu 


ggc 

Gly 


gcc 
Ala 


gcc 
Ala 


gac 
Asp 
-55 


gcc 
Ala 


gtc 
val 


ccc 
pro 


ggc 

Gly 


ggt 

Gly 


gac 
Asp 
-40 


acc 
Thr 


gtc 
val 


gtc 
val 


ctg 
Leu 

• 

« 


gag 
Glu 
-35 


age 
Ser 
-25 


ggc 

Gly 


ctg 
Leu 


etc 
Leu 


gcc 
Ala 


gac 
Asp 
-20 


gcc 
Ala 


acc 
Thr 


age 
ser 


agt 
ser 


gcg 

Ala 


cag 
Gin 
-5 


ccc 
pro 


gag 

Glu 


gcc 
Ala 


tac 
Tyr 


acc 
Thr 
10 


atg 
Met 


ggc 
Gly 


ggc 
Gly 


cgc 
Arg 


gcc 
Ala 


gcc 

Ala 
25 


ggt 
Gly 


cag 
Gin 


ccc 
pro 


99* 
Gly 


ttc 
Phe 
30 


ggc 

Gly 
40 


acc 
Thr 


cag 
Gin 


gtg 
val 


age 
Ser 


ate 
He 
45 


ggc 

Gly 


ate 
He 


ttc 
Phe 


ccg 
Pro 


ggc 

Gly 


aac 
Asn 
60 


gac 
Asp 


gcc 
Ala 


acg 
Thr 


ctg 
Leu 


acc 
Thr 


aac 
Asn 
75 


ctg 
Leu 


gtc 
val 


age 
ser 


gtc 
val 


gcc 

Ala 


ggc 
Gly 
90 


cac 
His 


aac 
Asn 


cag 
Gin 


gcg 

Ala 


ggc 

Gly 


tec 
Ser 
105 


acc 
Thr 


acc 
Thr 


ggc 
Gly 


tgg 
Trp 


cac 
His 
110 


teg 
Ser 
120 


gtg 

val 


age 
Ser 


tac 
Tyr 


ccc 
Pro 


gag 

Glu 

125 


ggc 

Gly 


gtg 

Val 


tgc 
Cys 


gcc 
Ala 


g?g 

Glu 


ccc 
pro 
140 


ggc 

Gly 


gac 
Asp 


cag gcc 
Gin Ala 


cag 
Gin 


ggc 
Gly 
155 


gtc 
val 


ace 
Thr 


tec 
Ser 


999 
Gly 


acc 
Thr 


acc 
Thr 
170 


ttc 
Phe 


tac 
Tyr 


cag 
Gin 


gag 
Glu 


gtc 

val 


cgt 
Arg 
185 


etc 
Leu 


egg 
Arg 


acc 
Thr 


taa 

? 

i 
■ 





-65 -60 



-50 -45 



-30 



Asp Ala ser Ala val Glu Val 
-15 -IP 



-1 1 



is 1 



20 



ice gcc ggt cac tgt ggc cgc gtg 
hr Ala Gly His cys Gly Arg val 

! 35 

fgc cag ggc gtc ttc gag cag tec 
;iy Gin Gly Val Phe Glu Gin ser 
50 55 

:tc gtc cgc ggc acg tec aac ttc 
•he val Arg Gly Thr Ser Asn Phe 
65 70 



80 « v 85 



115 



130 135 



,145 150 

jc ggc tec ggc aac tgc cgc acc ggc 

y Gly ser Gly Asn Cys Arg Thr Gly 
160 165 



val Thr Pro Met val Asn ser Trp Gly 
175 180 



Page 7 



SI 

a 
ft 



324 



372 



420 



468 



516 



564 



612 



660 



708 



756 



ccc ate ggc tec tec gtc tgc cgc tec 804 
Pro lie Gly Ser Ser val Cys Arg Ser 
95 100 !' 



852 



900 



948 



996 



aac tec tgg gg^c 1044 



1062 



, i 



) 



<210> 6
<211> 353
<212> PRT
<213> Nocardiopsis prasina DSM 15648 ("Protease 11")

<400> 6

Ala Thr Gly Pro Leu Pro Gln Ser Pro Thr Pro Glu Ala Asp Ala
-165                -160                -155

Val Ser Met Gln Glu Ala Leu Gln Arg Asp Leu Gly Leu Thr Pro
-150                -145                -140

Leu Glu Ala Asp Glu Leu Leu Ala Ala Gln Asp Thr Ala Phe Glu
-135                -130                -125

Val Asp Glu Ala Ala Ala Ala Ala Ala Gly Asp Ala Tyr Gly Gly
-120                -115                -110

Ser Val Phe Asp Thr Glu Thr Leu Glu Leu Thr Val Leu Val Thr Asp
-105                -100                -95                 -90

Ala Ala Ser Val Glu Ala Val Glu Ala Thr Gly Ala Gly Thr Glu Leu
                -85                 -80                 -75

Val Ser Tyr Gly Ile Glu Gly Leu Asp Glu Ile Ile Gln Asp Leu Asn
            -70                 -65                 -60

Ala Ala Asp Ala Val Pro Gly Val Val Gly Trp Tyr Pro Asp Val Ala
        -55                 -50                 -45

Gly Asp Thr Val Val Leu Glu Val Leu Glu Gly Ser Gly Ala Asp Val
    -40                 -35                 -30

Ser Gly Leu Leu Ala Asp Ala Gly Val Asp Ala Ser Ala Val Glu Val
-25                 -20                 -15                 -10

Thr Ser Ser Ala Gln Pro Glu Leu Tyr Ala Asp Ile Ile Gly Gly Leu
                -5              -1  1               5

Ala Tyr Thr Met Gly Gly Arg Cys Ser Val Gly Phe Ala Ala Thr Asn
        10                  15                  20

Ala Ala Gly Gln Pro Gly Phe Val Thr Ala Gly His Cys Gly Arg Val
    25                  30                  35

Gly Thr Gln Val Ser Ile Gly Asn Gly Gln Gly Val Phe Glu Gln Ser
40                  45                  50                  55

Ile Phe Pro Gly Asn Asp Ala Ala Phe Val Arg Gly Thr Ser Asn Phe
                60                  65                  70

Thr Leu Thr Asn Leu Val Ser Arg Tyr Asn Thr Gly Gly Tyr Ala Thr
            75                  80                  85

Val Ala Gly His Asn Gln Ala Pro Ile Gly Ser Ser Val Cys Arg Ser
        90                  95                  100

Gly Ser Thr Thr Gly Trp His Cys Gly Thr Ile Gln Ala Arg Gly Gln
    105                 110                 115

Ser Val Ser Tyr Pro Glu Gly Thr Val Thr Asn Met Thr Arg Thr Thr
120                 125                 130                 135

Val Cys Ala Glu Pro Gly Asp Ser Gly Gly Ser Tyr Ile Ser Gly Asn
                140                 145                 150

Gln Ala Gln Gly Val Thr Ser Gly Gly Ser Gly Asn Cys Arg Thr Gly
            155                 160                 165

Gly Thr Thr Phe Tyr Gln Glu Val Thr Pro Met Val Asn Ser Trp Gly
        170                 175                 180

Val Arg Leu Arg Thr
    185

<210> 7
<211> 1062
<212> DNA
<213> Nocardiopsis prasina DSM 15649 ("Protease 35")

<220>
<221> CDS
<222> (1)..(1059)

<220>
<221> mat_peptide
<222> (496)..(1059)

<400> 7

gcc acc gga cca ctc ccc cag tca ccc acc ccg gag gcc gac gcc   45
Ala Thr Gly Pro Leu Pro Gln Ser Pro Thr Pro Glu Ala Asp Ala
-165                -160                -155

gtc tcc atg cag gag gcg ctc cag cgc gac ctc ggc ctg acc ccg   90
Val Ser Met Gln Glu Ala Leu Gln Arg Asp Leu Gly Leu Thr Pro
-150                -145                -140

ctt gag gcc gat gaa ctg ctg gcc gcc cag gac acc gcc ttc gag   135
Leu Glu Ala Asp Glu Leu Leu Ala Ala Gln Asp Thr Ala Phe Glu
-135                -130                -125

gtc gac gag gcc gcg gcc gag gcc gcc ggt gac gcc tac ggc ggc   180
Val Asp Glu Ala Ala Ala Glu Ala Ala Gly Asp Ala Tyr Gly Gly
-120                -115                -110

tcc gtc ttc gac acc gag acc ctg gaa ctg acc gtc ctg gtc acc gac   228
Ser Val Phe Asp Thr Glu Thr Leu Glu Leu Thr Val Leu Val Thr Asp
-105                -100                -95                 -90

tcc gcc gcg gtc gag gcg gtg gag gcc acc ggc gcc ggg acc gaa ctg   276
Ser Ala Ala Val Glu Ala Val Glu Ala Thr Gly Ala Gly Thr Glu Leu
                -85                 -80                 -75

gtc tcc tac ggc atc acg ggc ctc gac gag atc gtc gag gag ctc aac   324
Val Ser Tyr Gly Ile Thr Gly Leu Asp Glu Ile Val Glu Glu Leu Asn
            -70                 -65                 -60

gcc gcc gac gcc gtt ccc ggc gtg gtc ggc tgg tac ccg gac gtc gcg   372
Ala Ala Asp Ala Val Pro Gly Val Val Gly Trp Tyr Pro Asp Val Ala
        -55                 -50                 -45

ggt gac acc gtc gtg ctg gag gtc ctg gag ggt tcc ggc gcc gac gtg   420
Gly Asp Thr Val Val Leu Glu Val Leu Glu Gly Ser Gly Ala Asp Val
    -40                 -35                 -30

ggc ggc ctg ctc gcc gac gcc ggc gtg gac gcc tcg gcg gtc gag gtg   468
Gly Gly Leu Leu Ala Asp Ala Gly Val Asp Ala Ser Ala Val Glu Val
-25                 -20                 -15                 -10

acc acc acc gag cag ccc gag ctg tac gcc gac atc atc ggc ggt ctg   516
Thr Thr Thr Glu Gln Pro Glu Leu Tyr Ala Asp Ile Ile Gly Gly Leu
                -5              -1  1               5

gcc tac acc atg ggc ggc cgc tgt tcg gtc ggc ttc gcg gcc acc aac   564
Ala Tyr Thr Met Gly Gly Arg Cys Ser Val Gly Phe Ala Ala Thr Asn
        10                  15                  20

gcc gcc ggt cag ccc ggg ttc gtc acc gcc ggt cac tgt ggc cgc gtg   612
Ala Ala Gly Gln Pro Gly Phe Val Thr Ala Gly His Cys Gly Arg Val
    25                  30                  35

ggc acc cag gtg acc atc ggc aac ggc cgg ggc gtc ttc gag cag tcc   660
Gly Thr Gln Val Thr Ile Gly Asn Gly Arg Gly Val Phe Glu Gln Ser
40                  45                  50                  55

atc ttc ccg ggc aac gac gcc gcc ttc gtc cgc gga acg tcc aac ttc   708
Ile Phe Pro Gly Asn Asp Ala Ala Phe Val Arg Gly Thr Ser Asn Phe
                60                  65                  70

acg ctg acc aac ctg gtc agc cgc tac aac acc ggc ggc tac gcc acd   756
Thr Leu Thr Asn Leu Val Ser Arg Tyr Asn Thr Gly Gly Tyr Ala Thr
            75                  80                  85

gtc gcc ggt cac aac cag gcg ccc atc ggc tcc tcc gtc tgc cgc tcc   804
Val Ala Gly His Asn Gln Ala Pro Ile Gly Ser Ser Val Cys Arg Ser
        90                  95                  100

ggc tcc acc acc ggt tgg cac tgc ggc acc atc cag gcc cgc ggc cag   852
Gly Ser Thr Thr Gly Trp His Cys Gly Thr Ile Gln Ala Arg Gly Gln
    105                 110                 115

tcg gtg agc tac ccc gag ggc acc gtc acc aac atg acg cgg acc acc   900
Ser Val Ser Tyr Pro Glu Gly Thr Val Thr Asn Met Thr Arg Thr Thr
120                 125                 130                 135

gtg tgc gcc gag ccc ggc gac tcc ggc ggc tcc tac atc tcc ggc aac   948
Val Cys Ala Glu Pro Gly Asp Ser Gly Gly Ser Tyr Ile Ser Gly Asn
                140                 145                 150

cag gcc cag ggc gtc acc tcc ggc ggc tcc ggc aac tgc cgc acc ggc   996
Gln Ala Gln Gly Val Thr Ser Gly Gly Ser Gly Asn Cys Arg Thr Gly
            155                 160                 165

ggg acc acc ttc tac cag gag gtc acc ccc atg gtg aac tcc tgg ggc   1044
Gly Thr Thr Phe Tyr Gln Glu Val Thr Pro Met Val Asn Ser Trp Gly
        170                 175                 180

gtc cgt ctc cgg acc taa   1062
Val Arg Leu Arg Thr
    185

<210> 8
<211> 353
<212> PRT
<213> Nocardiopsis prasina DSM 15649 ("Protease 35")

<400> 8

Ala Thr Gly Pro Leu Pro Gln Ser Pro Thr Pro Glu Ala Asp Ala
-165                -160                -155

Val Ser Met Gln Glu Ala Leu Gln Arg Asp Leu Gly Leu Thr Pro
-150                -145                -140

Leu Glu Ala Asp Glu Leu Leu Ala Ala Gln Asp Thr Ala Phe Glu
-135                -130                -125

Val Asp Glu Ala Ala Ala Glu Ala Ala Gly Asp Ala Tyr Gly Gly
-120                -115                -110

Ser Val Phe Asp Thr Glu Thr Leu Glu Leu Thr Val Leu Val Thr Asp
-105                -100                -95                 -90

Ser Ala Ala Val Glu Ala Val Glu Ala Thr Gly Ala Gly Thr Glu Leu
                -85                 -80                 -75

Val Ser Tyr Gly Ile Thr Gly Leu Asp Glu Ile Val Glu Glu Leu Asn
            -70                 -65                 -60

Ala Ala Asp Ala Val Pro Gly Val Val Gly Trp Tyr Pro Asp Val Ala
        -55                 -50                 -45

Gly Asp Thr Val Val Leu Glu Val Leu Glu Gly Ser Gly Ala Asp Val
    -40                 -35                 -30

Gly Gly Leu Leu Ala Asp Ala Gly Val Asp Ala Ser Ala Val Glu Val
-25                 -20                 -15                 -10

Thr Thr Thr Glu Gln Pro Glu Leu Tyr Ala Asp Ile Ile Gly Gly Leu
                -5              -1  1               5

Ala Tyr Thr Met Gly Gly Arg Cys Ser Val Gly Phe Ala Ala Thr Asn
        10                  15                  20

Ala Ala Gly Gln Pro Gly Phe Val Thr Ala Gly His Cys Gly Arg Val
    25                  30                  35

Gly Thr Gln Val Thr Ile Gly Asn Gly Arg Gly Val Phe Glu Gln Ser
40                  45                  50                  55

Ile Phe Pro Gly Asn Asp Ala Ala Phe Val Arg Gly Thr Ser Asn Phe
                60                  65                  70

Thr Leu Thr Asn Leu Val Ser Arg Tyr Asn Thr Gly Gly Tyr Ala Thr
            75                  80                  85

Val Ala Gly His Asn Gln Ala Pro Ile Gly Ser Ser Val Cys Arg Ser
        90                  95                  100

Gly Ser Thr Thr Gly Trp His Cys Gly Thr Ile Gln Ala Arg Gly Gln
    105                 110                 115

Ser Val Ser Tyr Pro Glu Gly Thr Val Thr Asn Met Thr Arg Thr Thr
120                 125                 130                 135

Val Cys Ala Glu Pro Gly Asp Ser Gly Gly Ser Tyr Ile Ser Gly Asn
                140                 145                 150

Gln Ala Gln Gly Val Thr Ser Gly Gly Ser Gly Asn Cys Arg Thr Gly
            155                 160                 165

Gly Thr Thr Phe Tyr Gln Glu Val Thr Pro Met Val Asn Ser Trp Gly
        170                 175                 180

Val Arg Leu Arg Thr
    185

<210> 9
<211> 1068
<212> DNA
<213> Nocardiopsis alba DSM 15647 ("Protease 08")

<220>
<221> CDS
<222> (1)..(1065)




<220>
<221> mat_peptide
<222> (502)..(1065)

<400> 9

gcg acc ggc ccc ctc ccc cag tcc ccc acc ccg gat gaa gcc gag   45
Ala Thr Gly Pro Leu Pro Gln Ser Pro Thr Pro Asp Glu Ala Glu
        -165                -160                -155

gcc acc acc atg gtc gag gcc ctc cag cgc gac ctc ggc ctg tcc   90
Ala Thr Thr Met Val Glu Ala Leu Gln Arg Asp Leu Gly Leu Ser
        -150                -145                -140

ccc tct cag gcc gac gag ctc ctc gag gcg cag gcc gag tcc ttc   135
Pro Ser Gln Ala Asp Glu Leu Leu Glu Ala Gln Ala Glu Ser Phe
        -135                -130                -125

gag atc gac gag gcc gcc acc gcg gcc gca gcc gac tcc tac ggc   180
Glu Ile Asp Glu Ala Ala Thr Ala Ala Ala Ala Asp Ser Tyr Gly
        -120                -115                -110

ggc tcc atc ttc gac acc gac agc ctc acc ctg acc gtc ctg gtc acc   228
Gly Ser Ile Phe Asp Thr Asp Ser Leu Thr Leu Thr Val Leu Val Thr
        -105                -100                -95

gac gcc tcc gcc gtc gag gcg gtc gag gcc gcc ggc gcc gag gcc aag   276
Asp Ala Ser Ala Val Glu Ala Val Glu Ala Ala Gly Ala Glu Ala Lys
    -90                 -85                 -80

gtg gtc tcg cac ggc atg gag ggc ctg gag gag atc gtc gcc gac ctg   324
Val Val Ser His Gly Met Glu Gly Leu Glu Glu Ile Val Ala Asp Leu
-75                 -70                 -65                 -60

aac gcg gcc gac gct cag ccc ggc gtc gtg ggc tgg tac ccc gac atc   372
Asn Ala Ala Asp Ala Gln Pro Gly Val Val Gly Trp Tyr Pro Asp Ile
                -55                 -50                 -45

cac tcc gac acg gtc gtc ctc gag gtc ctc gag ggc tcc ggt gcc gac   420
His Ser Asp Thr Val Val Leu Glu Val Leu Glu Gly Ser Gly Ala Asp
            -40                 -35                 -30

gtg gac tcc ctg ctc gcc gac gcc ggt gtg gac acc gcc gac gtc aag   468
Val Asp Ser Leu Leu Ala Asp Ala Gly Val Asp Thr Ala Asp Val Lys
        -25                 -20                 -15

gtg gag agc acc acc gag cag ccc gag ctg tac gcc gac atc atc ggc   516
Val Glu Ser Thr Thr Glu Gln Pro Glu Leu Tyr Ala Asp Ile Ile Gly
    -10                 -5              -1  1               5

ggt ctc gcc tac acc atg ggt ggg cgc tgc tcg gtc ggc ttc gcg gcc   564
Gly Leu Ala Tyr Thr Met Gly Gly Arg Cys Ser Val Gly Phe Ala Ala
                10                  15                  20

acc aac gcc tcc ggc cag ccc ggg ttc gtc acc gcc ggc cac tgc ggc   612
Thr Asn Ala Ser Gly Gln Pro Gly Phe Val Thr Ala Gly His Cys Gly
            25                  30                  35

acc gtc ggc acc ccg gtc agc atc ggc aac ggc cag ggc gtc ttc gag   660
Thr Val Gly Thr Pro Val Ser Ile Gly Asn Gly Gln Gly Val Phe Glu
        40                  45                  50

cgt tcc gtc ttc ccc ggc aac gac tcc gcc ttc gtc cgc ggc acc tcg   708
Arg Ser Val Phe Pro Gly Asn Asp Ser Ala Phe Val Arg Gly Thr Ser
    55                  60                  65

aac ttc acc ctg acc aac ctg gtc agc cgc tac aac acc ggt ggt tac   756
Asn Phe Thr Leu Thr Asn Leu Val Ser Arg Tyr Asn Thr Gly Gly Tyr
70                  75                  80                  85

gcg acc gtc tcc ggc tcc tcg cag gcg gcg atc ggc tcg cag atc tgc   804
Ala Thr Val Ser Gly Ser Ser Gln Ala Ala Ile Gly Ser Gln Ile Cys
                90                  95                  100

cgt tcc ggc tcc acc acc ggc tgg cac tgc ggc acc gtc cag gcc cgc   852
Arg Ser Gly Ser Thr Thr Gly Trp His Cys Gly Thr Val Gln Ala Arg
            105                 110                 115

ggc cag acg gtg agc tac ccc cag ggc acc gtg cag aac ctg acc cgc   900
Gly Gln Thr Val Ser Tyr Pro Gln Gly Thr Val Gln Asn Leu Thr Arg
        120                 125                 130

acc aac gtc tgc gcc gag ccc ggt gac tcc ggc ggc tcc ttc atc tcc   948
Thr Asn Val Cys Ala Glu Pro Gly Asp Ser Gly Gly Ser Phe Ile Ser
    135                 140                 145

ggc agc cag gcc cag ggc gtc acc tcc ggt ggc tcc ggc aac tgc tcc   996
Gly Ser Gln Ala Gln Gly Val Thr Ser Gly Gly Ser Gly Asn Cys Ser
150                 155                 160                 165

ttc ggt ggc acc acc tac tac cag gag gtc aac ccg atg ctg agc agc   1044
Phe Gly Gly Thr Thr Tyr Tyr Gln Glu Val Asn Pro Met Leu Ser Ser
                170                 175                 180

tgg ggt ctg acc ctg cgc acc tga   1068
Trp Gly Leu Thr Leu Arg Thr
            185

<210> 10
<211> 355
<212> PRT
<213> Nocardiopsis alba DSM 15647 ("Protease 08")

<400> 10

Ala Thr Gly Pro Leu Pro Gln Ser Pro Thr Pro Asp Glu Ala Glu
        -165                -160                -155

Ala Thr Thr Met Val Glu Ala Leu Gln Arg Asp Leu Gly Leu Ser
        -150                -145                -140

Pro Ser Gln Ala Asp Glu Leu Leu Glu Ala Gln Ala Glu Ser Phe
        -135                -130                -125

Glu Ile Asp Glu Ala Ala Thr Ala Ala Ala Ala Asp Ser Tyr Gly
        -120                -115                -110

Gly Ser Ile Phe Asp Thr Asp Ser Leu Thr Leu Thr Val Leu Val Thr
        -105                -100                -95

Asp Ala Ser Ala Val Glu Ala Val Glu Ala Ala Gly Ala Glu Ala Lys
    -90                 -85                 -80

Val Val Ser His Gly Met Glu Gly Leu Glu Glu Ile Val Ala Asp Leu
-75                 -70                 -65                 -60

Asn Ala Ala Asp Ala Gln Pro Gly Val Val Gly Trp Tyr Pro Asp Ile
                -55                 -50                 -45

His Ser Asp Thr Val Val Leu Glu Val Leu Glu Gly Ser Gly Ala Asp
            -40                 -35                 -30

Val Asp Ser Leu Leu Ala Asp Ala Gly Val Asp Thr Ala Asp Val Lys
        -25                 -20                 -15

Val Glu Ser Thr Thr Glu Gln Pro Glu Leu Tyr Ala Asp Ile Ile Gly
    -10                 -5              -1  1               5

Gly Leu Ala Tyr Thr Met Gly Gly Arg Cys Ser Val Gly Phe Ala Ala
                10                  15                  20

Thr Asn Ala Ser Gly Gln Pro Gly Phe Val Thr Ala Gly His Cys Gly
            25                  30                  35

Thr Val Gly Thr Pro Val Ser Ile Gly Asn Gly Gln Gly Val Phe Glu
        40                  45                  50

Arg Ser Val Phe Pro Gly Asn Asp Ser Ala Phe Val Arg Gly Thr Ser
    55                  60                  65

Asn Phe Thr Leu Thr Asn Leu Val Ser Arg Tyr Asn Thr Gly Gly Tyr
70                  75                  80                  85

Ala Thr Val Ser Gly Ser Ser Gln Ala Ala Ile Gly Ser Gln Ile Cys
                90                  95                  100

Arg Ser Gly Ser Thr Thr Gly Trp His Cys Gly Thr Val Gln Ala Arg
            105                 110                 115

Gly Gln Thr Val Ser Tyr Pro Gln Gly Thr Val Gln Asn Leu Thr Arg
        120                 125                 130

Thr Asn Val Cys Ala Glu Pro Gly Asp Ser Gly Gly Ser Phe Ile Ser
    135                 140                 145

Gly Ser Gln Ala Gln Gly Val Thr Ser Gly Gly Ser Gly Asn Cys Ser
150                 155                 160                 165

Phe Gly Gly Thr Thr Tyr Tyr Gln Glu Val Asn Pro Met Leu Ser Ser
                170                 175                 180

Trp Gly Leu Thr Leu Arg Thr
            185
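
The CDS and mat_peptide coordinates in the listing relate to the residue numbering by simple arithmetic, and the opening codons of the SEQ ID NO: 1 coding sequence can be checked against SEQ ID NO: 2 by translation with the standard genetic code. The Python sketch below illustrates both checks. The helper names `translate` and `propeptide_and_mature_lengths` are hypothetical and introduced only for illustration; they are not part of the application.

```python
# Illustrative sketch only; the function names are hypothetical.

BASES = "tcag"
# Standard genetic code in the conventional t, c, a, g ordering ('*' marks a stop codon).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a coding sequence (one-letter output), stopping at the first stop codon."""
    dna = dna.replace(" ", "").lower()
    protein = []
    for pos in range(0, len(dna) - len(dna) % 3, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def propeptide_and_mature_lengths(cds_start: int, mat_start: int, mat_end: int):
    """Return (pro-region residues, mature residues) from 1-based feature coordinates."""
    pro = (mat_start - cds_start) // 3
    mature = (mat_end - mat_start + 1) // 3
    return pro, mature


# First 16 codons of the SEQ ID NO: 1 coding sequence (nucleotides 900-947).
first_codons = "gcc gac atc atc ggt ggt ctg gcc tac acc atg ggc ggc cgc tgt tcg"
print(translate(first_codons))

# mat_peptide features of SEQ ID NOs: 3, 5 and 9 (each CDS starts at nucleotide 1).
for seq_id, mat_start, mat_end in [(3, 499, 1062), (5, 496, 1059), (9, 502, 1065)]:
    print(seq_id, propeptide_and_mature_lengths(1, mat_start, mat_end))
```

Under these coordinates each mature peptide has 188 residues, and adding the pro-region lengths gives 354, 353 and 355 residues, in agreement with the <211> entries of SEQ ID NOs: 4, 6 and 10.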
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1/26 



Received 10 Oct. 2003, PVS

                     10         20         30         40         50
Protease 10  ADIIGGLAYT MGGRCSVGFA ATNAAGQPGF VTAGHCGRVG TQVTIGNGRG
Protease 18  ADIIGGLAYY MGGRCSVGFA ATNSAGQPGF VTAGHCGTVG TGVTIGNGTG
Protease 11  ADIIGGLAYT MGGRCSVGFA ATNAAGQPGF VTAGHCGRVG TQVSIGNGQG
Protease 35  ADIIGGLAYT MGGRCSVGFA ATNAAGQPGF VTAGHCGRVG TQVTIGNGRG
Protease 08  ADIIGGLAYT MGGRCSVGFA ATNASGQPGF VTAGHCGTVG TPVSIGNGQG

                     60         70         80         90        100
Protease 10  VFEQSVFPGN DAAFVRGTSN FTLTNLVSRY NTGGYATVAG HNQAPIGSSV
Protease 18  TFQNSVFPGN DAAFVRGTSN FTLTNLVSRY NSGGYQSVTG TSQAPAGSAV
Protease 11  VFEQSIFPGN DAAFVRGTSN FTLTNLVSRY NTGGYATVAG HNQAPIGSSV
Protease 35  VFEQSIFPGN DAAFVRGTSN FTLTNLVSRY NTGGYATVAG HNQAPIGSSV
Protease 08  VFERSVFPGN DSAFVRGTSN FTLTNLVSRY NTGGYATVSG SSQAAIGSQI

                    110        120        130        140        150
Protease 10  CRSGSTTGWH CGTIQARGQS VSYPEGTVTN MTRTTVCAEP GDSGGSYISG
Protease 18  CRSGSTTGWH CGTIQARNQT VRYPQGTVYS LTRTNVCAEP GDSGGSFISG
Protease 11  CRSGSTTGWH CGTIQARGQS VSYPEGTVTN MTRTTVCAEP GDSGGSYISG
Protease 35  CRSGSTTGWH CGTIQARGQS VSYPEGTVTN MTRTTVCAEP GDSGGSYISG
Protease 08  CRSGSTTGWH CGTVQARGQT VSYPQGTVQN LTRTNVCAEP GDSGGSFISG

                    160        170        180      188
Protease 10  TQAQGVTSGG SGNCRTGGTT FYQEVTPMVN SWGVRLRT
Protease 18  SQAQGVTSGG SGNCSVGGTT YYQEVTPMIN SWGVRIRT
Protease 11  NQAQGVTSGG SGNCRTGGTT FYQEVTPMVN SWGVRLRT
Protease 35  NQAQGVTSGG SGNCRTGGTT FYQEVTPMVN SWGVRLRT
Protease 08  SQAQGVTSGG SGNCSFGGTT YYQEVNPMLS SWGLTLRT
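
Because the five mature sequences in the alignment have the same length (188 residues) and are shown without gaps, a simple column-by-column comparison gives a rough feel for how similar two of them are. The Python sketch below does this for Protease 10 and Protease 18. It is only an approximation under that assumption: the application may define percentage of identity through a specific alignment program and parameters, and `percent_identity` is a hypothetical helper introduced for illustration.

```python
# Illustrative sketch only; `percent_identity` is a hypothetical helper.

def percent_identity(seq_a: str, seq_b: str) -> float:
    """Column-wise percent identity of two pre-aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * matches / len(seq_a)


# Mature sequences as shown in the alignment, typed in blocks of ten residues.
PROTEASE_10 = (
    "ADIIGGLAYT MGGRCSVGFA ATNAAGQPGF VTAGHCGRVG TQVTIGNGRG "
    "VFEQSVFPGN DAAFVRGTSN FTLTNLVSRY NTGGYATVAG HNQAPIGSSV "
    "CRSGSTTGWH CGTIQARGQS VSYPEGTVTN MTRTTVCAEP GDSGGSYISG "
    "TQAQGVTSGG SGNCRTGGTT FYQEVTPMVN SWGVRLRT"
).replace(" ", "")

PROTEASE_18 = (
    "ADIIGGLAYY MGGRCSVGFA ATNSAGQPGF VTAGHCGTVG TGVTIGNGTG "
    "TFQNSVFPGN DAAFVRGTSN FTLTNLVSRY NSGGYQSVTG TSQAPAGSAV "
    "CRSGSTTGWH CGTIQARNQT VRYPQGTVYS LTRTNVCAEP GDSGGSFISG "
    "SQAQGVTSGG SGNCSVGGTT YYQEVTPMIN SWGVRIRT"
).replace(" ", "")

identity = percent_identity(PROTEASE_10, PROTEASE_18)
print(f"Protease 10 vs Protease 18: {identity:.1f}% identical over {len(PROTEASE_10)} positions")
```

The same function can be applied to any other pair of rows in the alignment, since all rows share the 188-residue numbering of SEQ ID NO: 2.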
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